A fact has a peculiar and intricate structure. It belongs to two
worlds, the world of objects and events, and the world of human
discourse. Facts are invisible and inaudible. They can not be burned in
furnaces or used to roof houses, or shot from big guns. 

Objects and events are not facts; they are merely objects and events. 
They are not facts until they are described by persons. And it is in the 
nature of that description that the quintessence of fact lies. Only when 
an event has been given a very specific kind of description does it be- 
come a fact. 

When we say, “Let's get down to the facts,” what we are saying is
much more than that we should look at or listen to or smell or touch 
real objects, or that we should all observe an event. What we are really 
proposing is that we all try to find certain statements on which we can 
all agree. Facts are the basis of human cooperation. 

A fact is an event so described that any observer will agree to the
description. There are, of course, no facts that meet this too general
requirement. We are satisfied (we have established our fact) if any observer
within the circle of persons with whom we discuss events will agree. 
There are always feeble-minded persons, ignorant persons, insane per- 
sons, apathetic persons, whom we disregard. There are, therefore, no 
absolute facts, and a universe without men and human discourse would 
be a universe without facts. 

It may be readily granted that this definition of a fact is no fact
itself. There are many men who would hold that facts are just events 
and objects, and will so continue to assert even after they hear this
definition. The definition is made, however, with the hope that certain 
elite hearers will immediately accept it. They are the audience to whom
the definition is addressed.

Psychological facts are events so described that any psychologist 
will accept the description. This description is done perforce by
psychologists and for the benefit of other psychologists. Laymen will not
necessarily be familiar with the language itself, or with what should be 
looked at (for, after all, facts are based upon what is seen and heard and
are not pure inventions of men).

Facts are of particular importance to science. Science is founded on 
an interest in fact. By that I do not mean an interest in nature. I mean 
an interest in getting down to a factual basis which starts with descrip- 
tions acceptable to any observer. This agreement must not be limited 
to the members of a cult or the partisans of a cause, or the speakers of a 
language. Sun worshippers may all agree that the Sun God has reap- 
peared with the new day. Members of other cults will not accept his 
divine attributes. But all men in any language will agree that the 
circular disc of light is again present. 

The agreement that is essential to facts can not depend upon skill 
or judgment or taste, unless we have assurance that that skill or judg- 
ment or taste can readily be acquired with training. The expert tea- 
taster, the connoisseur of pictures, the skilled diagnostician does not 
contribute to the advance of science unless he can discover a factual 
basis for his judgments open to all interested men. 

As psychologists we face in the near future the onrush of a torrent 
of new facts. The number of psychologists has doubled in a few years 
and if we read the signs correctly is about to double again in an even 
shorter period. Even the present members of the Association will be 
contributing to the deluge if we can interpret the fact that during one 
year approximately three-fourths of us (three thousand of us in round 
numbers) have changed our addresses and presumably have en- 
countered new persons and problems and new scenes. Industrial psychol- 
ogists, school psychologists, personnel psychologists, clinical psychol- 
ogists, will soon be filling the pages of our journals with statements 
which will, it is to be hoped, include the right proportion of facts. We 
hope to remain on a factual basis. 

But that is not to be taken for granted. A flood of new publications 
is not automatically a flood of new facts. And it may include many facts 
which do not contribute materially to the science of psychology. Col- 
lections of facts are not science. They are the material out of which 
science can grow, but they are only the raw material of science, and 
sometimes they are not even that. 

Psychological facts are events described in psychological terms and 
therefore by and for psychologists. The descriptions which facts require 
have not been lying about waiting to be noticed. They are the result of 
hard work and careful and devoted attention. And their value depends 
on the insight and good judgment of their collectors. There are useless
and misleading facts as well as useful and enlightening facts. There can 
be a great wastage of paper and of human effort in the publication of 
facts. Collections of facts for their own sake are of no more value than 
the collections of old objects. 

Some facts are useful in themselves. Medical knowledge includes 
many facts about the effects of drugs and treatment for which there is 
no rationale, and as psychology extends into new fields it will welcome 
many facts about human behavior that cannot be fitted into any 
theory. The success of applied psychology will depend on the accumula- 
tion of much knowledge of this sort. And knowledge of this sort must 
be taught to students if they are to become practitioners. The open- 
mindedness of the physician toward facts of this sort which have no 
bearing on theories has saved the lives of many of us and is the real 
mark of the physician as distinguished from the scientist. The physician 
is interested in the cure of his patients. His success depends on his 
acquaintance with thousands of medical facts. The discovery of penicil- 
lin may lead to a saving of human life years that will match the loss of 
the fifty million lives in the war just finished. But until many re- 
searchers have patiently collected the relevant scientific facts that 
enable them to make scientific generalizations about how penicillin 
brings about its results, the discovery of its healing effect is not yet a 
contribution to the sciences of medicine and physiology. It is only a 
contribution to the tools of the physician. 

My own personal bias in psychology is toward an understanding of 
learning and habit formation. In the two fields of learning and of 
motivation will be worked out the basic theory that will eventually 
make the science of psychology a much more powerful instrument than 
it now is. When we are able to state the general principles which govern 
human learning we shall have the most important tool needed for the 
prediction and control of human behavior. 

Nothing is more familiar to men than human learning and habit 
formation. Men have lived intimately with the phenomenon since long
before they reached the human state, and they have in myriad ways
remarked and described it. But a scientific theory of learning has yet
to be agreed upon by psychologists. Such a theory is essential to prog- 
ress for several reasons. One of these is that unless the beads of fact can 
be strung in order and pattern on the threads of a theory, there is a 
strict limitation upon imparting psychological knowledge to others. 
Theories are mnemonic devices that make science teachable. And 
theories are the basis of working concepts. They enable men to confront 
new facts and deal with them successfully. Furthermore, theories are 
required to direct the search for relevant facts. It is theories that
endure, not facts. Events are ephemeral and their descriptions also 
may be ephemeral. It is theory that lasts for years or for generations. 
It is theory rather than fact that leads to new controls over nature and 
events. From theory inferences can be made and new applications 
devised. Facts are likely to be local and temporary. Their applications 
are limited. 

A blacksmith may have collected much skill at his trade, and a few 
facts. He has learned that iron bought from a certain firm has certain 
good qualities. He can, by watching the color, judge the moment for 
making his weld. He knows how to temper his steel and how to draw 
its temper. The science of metallurgy lies in making a very different
collection of facts. It is not interested in subjective color but in
temperature. It substitutes chemical analysis for facts about market source.
Market sources are ephemeral and smiths report colors less reliably 
than thermometers and thermocouples report temperature. 

Like metallurgy, a scientific psychology consists in a new orientation 
toward psychological facts, a weeding out of subjective descriptions and 
an avoidance of descriptions colored by values and prejudices that are 
not universally shared. At the present moment our science is entirely 
too tolerant of such concepts as adjustment, reward and punishment, 
success and failure. These are all strongly flavored by values that are 
nonfactual. The approval of our own group or cult determines what
we shall call an adjustment or what we shall call success. 

It is my own conviction that in the field of learning the great ma- 
jority of studies have been collecting unpromising kinds of facts. They 
have collected facts analogous to the blacksmith’s lore concerning how 
long a particular tempering of his iron will wear upon the horse’s hoof, 
how well pleased his patrons are with his wagon tires, how fatigued he 
will be with one method of welding as compared with another. 

The reason for this is that we have allowed ourselves to be too much 
influenced by the desire for results of immediate practical application. 
This has led to the common acceptance by psychologists of a definition 
of learning in terms of practical value. Most psychologists, when they 
use the word learning, mean the acquisition of socially approved modes
of behavior, improvement in performance, in economy of effort and of 
time in attaining conventional goals. The early writers on learning, 
Thorndike, Lloyd Morgan, Hobhouse, defined learning in terms of 
achievement. The animal learns a task set for him by the experimenter. 
He improves his accomplishment. 

This conception, of course, is in good accord with practical common 
sense. It is what gets done that is of practical importance, not the
response of the person, but the results of that response. But to use 
practical achievement, goal attainment, success, as the essential cri- 
terion of learning, and to turn our search for facts to the observation of 
success and the conditions under which it is attained is analogous to the 
use of money value by the chemist as his chief descriptive term in 
observing a chemical reaction, or the definition by the physicist of work 
in terms of useful work or valuable work. All the psychologies which are 
written in terms of “least effort” or of goal achievement are by that
choice rejecting the possibility of developing an objective and scientific 
psychology. They are, of course, following public interest which is 
turned toward securing quick results in training, or toward the abolish- 
ment of obnoxious habits, the acquisition of paying skills. We shall 
never learn how skills are acquired if we confine our attention to 
“improvement” in behavior and use as the criterion of learning the
elimination of bad behavior and the acquisition of good, or the ac- 
complishment of praiseworthy results. We must understand the 
processes through which behavior is changed, whether for better or for 
worse. A clinical psychologist is properly interested in the “cure” of an
enuresis. If he is a real psychologist as well as a clinician, he will be 
interested in just what alteration in behavior was brought about; the 
fact that the alteration was acceptable to his patient's family may con- 
tribute to his income, but not to his science. 

The conception of learning in terms of socially valuable outcomes 
of action led to the collection of learning curves which indicated the 
reduction of time and of waste motion with practice. It even led to a 
perversion of Pavlov’s conditioning experiments in which as many as 
1500 pairings of stimuli are recorded along with the resulting change in 
certainty or intensity of response. During that series the phenomenon 
of learning has occurred at each pairing. The massed effect of the 1500 
trials may totally obscure or totally miss what happens at each trial. 
Studies with the maze, the puzzle box, the acquisition of skills, all 
record some end-result, but do not collect facts involving the animal 
itself. 

The literature of mental tests had over twenty years ago collected
some ten thousand titles and the number must be at least three times 
that figure by now. For the most part the testers have limited their 
collection of facts to the marks put on paper by persons being tested, 
and to the association of these marks with some criterion. They have 
not examined the behavior of the child taking the test, nor has this 
enormous literature advanced our understanding of what goes on in a 
child who marks the third of four possible choices. In other words, the 
testing movement has been absorbed in highly useful and practical
work, but it has not contributed to psychological theory. It has not 
advanced our knowledge of how the child's mind works. 

In the same way studies in learning have been dominated by practi- 
cal considerations and their facts collected center in practical outcomes 
of behavior rather than in the behavior process itself. We must under- 
take to examine the nature of changes in behavior before we shall have 
a proper understanding of success and failure. 

My first suggestion for directing our attention toward facts that will 
lead to the development of good theory applies chiefly to the field of 
learning. It is that we look for facts in the behavior of the organism 
rather than in the operation of a latch, an arrival at a goal, the “learn- 
ing” of a lesson. We should transfer our interest from the goal achieve- 
ment to the behaving organism. It is the muscles of the organism that 
are innervated, and not the lever of the problem box. The machinery 
through which solutions are arrived at is contained within the skin of 
the solver. 

May I illustrate what here is meant. Studies of maze learning have 
kept records of the time and number of errors required on successive 
trials to get the animal to a particular area. Learning curves have been 
plotted and learning assumed to be a direct function of the number of 
trials. Practically no experimenter has taken account of the fact that 
each animal may radically alter its behavior on successive trials, or that 
the alteration may have been evident only between the eleventh and the 
twelfth trials and exhibited no curve at all. The curve is only the 
resultant of many cumulative learnings, which may have included a 
number of “unlearnings” as well. The picture of learning as a function
of the number of trials may be totally altered when we examine behavior 
at each choice point separately. 
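
The point about averaged curves can be shown with a small sketch. It is only an illustration under stated assumptions: the escape times (60 and 5 seconds), the six hypothetical animals, and the trial on which each one changes are invented numbers, and the function names are not drawn from any study. Each animal alters its behavior abruptly on a single trial, yet the group mean declines in gradual steps.

```python
def individual_times(switch_trial, n_trials=12, slow=60.0, fast=5.0):
    """Time per trial for one hypothetical animal whose behavior
    changes abruptly on `switch_trial` (all values are invented)."""
    return [slow if t < switch_trial else fast for t in range(1, n_trials + 1)]


def group_mean(curves):
    """Trial-by-trial mean over all animals."""
    n = len(curves)
    return [sum(column) / n for column in zip(*curves)]


def change_points(curve):
    """Trial numbers (1-based) on which the value differs from the previous trial."""
    return [i + 1 for i in range(1, len(curve)) if curve[i] != curve[i - 1]]


# Six hypothetical animals, each changing on a different trial.
switches = [2, 4, 6, 8, 10, 12]
animals = [individual_times(s) for s in switches]
mean_curve = group_mean(animals)

for s, curve in zip(switches, animals):
    print("animal changing on trial", s, "->", change_points(curve))
print("group mean:", [round(v, 1) for v in mean_curve])
```

The last animal shows no change at all until the step between the eleventh and twelfth trials, while the averaged record falls a little at a time across the whole series. The average describes the group, not the course of learning in any one animal.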

Dr. George P. Horton and I occupied ourselves two pre-war winters 
in observing and recording some eight hundred escapes of cats from a 
puzzle box. One startling result of an examination of our photographic 
records of the posture of the cat at the moment of release is the discovery 
that a series of escapes often displays a highly routinized pattern and 
stereotyped posture which appears at widely separated points in the 
series. Here is an elaborate series of movements extending over a period 
of many seconds or even minutes which has not disappeared from the 
cat’s repertoire, although it has not been in evidence in the cat’s
behavior for many trials. It was not unlearned or forgotten, as is proved
by its accurate reproduction. 

If we had contented ourselves with a record of the time required to
escape, we should have missed the real nature of the learning process. 
So far as we can judge, improvement in the sense of time reduction
consisted in the gradual elimination of movement routines that left the 
cat in the box. The successful act itself always appeared suddenly, 
either in the very first trial or in some subsequent trial. It required no 
long series of repetitions for its establishment. 

It has been suggested that it will be profitable to give more attention 
to the behaving organism if we are to understand learning. There is a 
second admonition which might well be taken seriously. This is that 
we may profitably give more attention to stimuli as the occasions for re- 
sponse. No psychologist has seriously challenged the conception that 
the normal occasion for muscular contraction, and hence for all that an 
animal does, is the activation of sensory receptors. There are psycholo- 
gists, however, who believe that the stimulus-response formula has had 
its day. For myself, I do not believe that it has been yet properly ex- 
ploited. It requires that, if we are studying learning, we observe the 
response actually following stimulation. Many recent experimenters 
have, instead, followed Pavlov and observed not the sequence of stim- 
ulus-response but the conjunction of two stimuli like bell and food, or 
buzzer and shock, and have not believed it necessary to notice what ac- 
tual response followed the signal. 

Psychologists who think in terms of punishment and reward have 
almost uniformly neglected to note how the animal at the time re- 
sponded to the punishment or to the reward, and the role this played in 
subsequent behavior. The resulting generalization is inevitably an at- 
tempt to link the intentions of the experimenter (intentions to reward 
or punish) with good or bad behavior on the part of the animal. Pun- 
ishment and reward are, objectively viewed, stimuli acting on the ani- 
mal's sense organs, and their effect must be mediated through the 
animal’s nervous system and appear in muscular contraction or glandu- 
lar secretion. Since levers and loops and mazes are not innervated, the 
operations of these devices are incidental to the actual learning which 
the living animal performs. 

This failure to examine facts in the field of stimulus-response se- 
quence is, of course, a tradition of psychology. Lloyd Morgan, Hob- 
house, Thorndike, responsible for our first careful observations of learn- 
ing, all were interested primarily in success rather than in response, 
and all speak in terms of “confirming results.” It has occurred to none
of them to regard these confirming results as possible stimuli, followed 
by possible response. Hull, who has endeavored to make the concept 
of reward over into something much more objective and immediate, so
far as I can understand leaves the determination of what it is that will 
serve to confirm or reinforce quite vague. I believe it will be very 
profitable to examine his reinforcements as possible stimuli with close
attention to their subsequent responses. 

There are many fields in psychology in which the injunction to note 
and formulate facts concerning the stimulus-response sequence might 
well be followed. Freud, who makes associative learning the foundation 
of his whole system, has at no point even asked what it is that is asso- 
ciated, or under what circumstances association is effective. A number 
of psychologists have in recent years insisted on interposing an O for
“organism” between S for “stimulus” and R for “response.” There can
be no objection to this, provided we make a vigorous effort to determine
the classes of fact that we agree to include in this O for “organism” and
are not content to leave it as O. There can be no doubt that what is in- 
tended to be included in this O is often reducible to a dependence of 
response on interoceptive and proprioceptive stimuli, and O is a symbol
for groups of relevant facts that should be noted and recorded rather 
than given up. 

There is another more legitimate excuse for O. Very properly classed 
as characters of the organism affecting the stimulus-response sequence 
are the facts of past learning which can be known only through the 
record of past behavior. We have as yet no way of noting the brain 
changes that we assume to be the actualities responsible for changed 
response to stimulation. There are also legitimately included under O 
the determiners of behavior sought out by tests, which may be inter- 
preted as behavior samples and assumed to be prognostic of response for 
varying periods of time. O may also include the material being offered 
through the more objective methods of examination and interview, and, 
by inference, the information furnished through the history of the in- 
dividual. 

In all these classes of fact it is of first importance to remember that 
facts are events so described that any competent observer will accept 
the description. We should recall that acceptance within limited groups, 
like the staff of a hospital working under an aggressive leader, or any 
department under an aggressive chief, may exhibit acceptance on 
grounds that do not insure that intelligent and informed outsiders will 
be able to agree to the asserted facts. Stimuli and movements are rela- 
tively objective and the agreement necessary to the establishment of 
fact is relatively easy to obtain. Attitudes and the meaning of behavior 
are less objective and more likely to produce disagreement among
observers. Only on a factual level can the foundations of science be laid.
Progress toward scientific psychology must be founded on agreed facts 
and public facts. The psychoanalysts’ interpretations of dreams and of
motivation in general are notably remote from the factual basis that
must precede the development of a scientific psychology in that field.
But that a factual basis is not unattainable in a field so remote from 
basic theory as psychotherapy is established by Carl Rogers’ recent
book, Counseling and Psychotherapy, in that he has succeeded in reducing
his account to descriptions of events which should prove to be acceptable
to psychologists of varied interests and varied theoretical background.
This he has even achieved in a number of his quantitative
generalizations. He illustrates the type of fact collection which, though 
made with a highly practical aim, may furnish a basis for the theory 
that must eventually be developed in order to give system and order to 
our facts. That theory will extend far beyond psychotherapy into 
many fields of psychology. 

My first suggestion concerning the factual basis for learning theory 
was that we give more attention to the organism itself, and that we 
recognize that such classes of fact as improvement, success and failure, 
reward and punishment, are external and incidental features of learning. 
The mechanism of learning is within the organism. These external fea- 
tures should be examined only in their role as stimuli to sense organs. 

My second suggestion was that the promising factual field for ob- 
servation is the stimulus-response sequence, and that we should meticu- 
lously note such sequences. My third suggestion was that part of what 
some writers insert in that stimulus-response formula, namely the or- 
ganism, can, with diligence, be examined in terms of interoceptive and 
proprioceptive stimuli, often observable and often inferable (as in the 
case of so-called drives like hunger). Much of the rest of O names facts 
of the organism's past history, from which we infer changed tendencies 
to reaction. Such of O as is left over in the form of attitudes we must 
endeavor to place on a basis of public fact and seek for descriptions
which are acceptable to all psychologists. 

There is a further admonition. This is that we should undertake 
more consistently and thoroughly to note what I may call response- 
stimulus sequences, the stimulus changes following upon the responses 
of the organism. I have already expressed the opinion that learning
and motivation represent the two fields most fundamental to an under- 
standing of behavior and thought. Through close attention to stimulus- 
response sequences we may formulate the rules of learning, the circum- 
stances under which such sequences change. Through close attention 
to response-stimulus sequences we may solve many of the problems of 
motivation and the direction of learning. 

It is through observation of the effects of response on stimulation 
that we may avoid those vague references to drive and motive that have
done so much to obscure the understanding of behavior. A tense blad- 
der through reflex paths operates to relax a sphincter muscle, but that 
relaxation is inhibited through associative learning by numerous situa- 
tions. When these associative cues are removed or facilitating associa- 
tive cues are added, the act occurs. The original stimulation is removed. 
The incident, save for its effects on future behavior through associative 
learning, is for the time being closed. To invent a drive to explain this
act is unnecessary as soon as we are familiar with the stimulus-response 
antecedents. To allow the disappearance of the restlessness that fol- 
lows sphincter relaxation to force us to speak in terms of a drive that 
has attained its goal and is now satisfied is unnecessary when we observe 
the effects of the response on the new stimulus situation, the R-S se- 
quence. 

Every response alters the stimulus situation of an animal. Some re- 
sponses remove the persistent and insistent stimulus that has been re- 
sponsible for general activation as well as specific action tendencies. 
Such responses have a profound effect on the behavior following and 
on the mode of response that will be acquired by the animal through 
training. 

Other responses leave the stimulus goad in action and the effect is to 
bring new goads into play. In fact the whole direction of behavior is set 
by the effects of responses on stimuli. The advocates of the law of ef- 
fect (Thorndike) or the law of reinforcement (Hull) state the foregoing 
sentence differently. Their version would be: The whole direction of
behavior is set by the effects of responses. You will recall that the ver- 
sion here suggested is: The whole direction of behavior is set by the ef- 
fects of responses on stimuli. Punishment and reward have no effect on 
behavior as mere rewarders or reinforcers, but only in so far as they 
stimulate new behavior. We learn to do what punishment and rewards 
make us do. We do not necessarily learn to do what was rewarded or 
learn to abstain from what was punished. 

In stimulus-response sequences there is to be found the key to associative
learning. In response-stimulus sequences we may discover the motiva- 
tion and direction of behavior. That we learn is insured by S-R. Stim- 
ulus patterns active at the time a response is initiated become inciters 
of that response. Because inciters of rival responses may also be active, 
the response does not always occur; but what effect such stimulus
patterns contribute is toward the production of the response with which
they were last associated.

That we learn is insured by the association of a stimulus with a
response. Whether that learning is retained depends on what then
follows. It depends on the effect of the response on the new stimulus
situation. May I illustrate this with an anecdote of animal learning. The
anecdote is not factual in that it describes an event witnessed by only 
one psychologist and he would be too humane to repeat it. But its 
analogue is very familiar in the puzzle box behavior which George Hor- 
ton and I have extensively photographed. The anecdote is this: The 
psychologist in question has a cat which on entering the kitchen before 
mealtime limps with a very noticeable limp. This limp is not observed 
in the cat at other times. Its history is that the cat on one occasion en- 
tering the kitchen at mealtime had its foot pinched in the swinging 
door. The cat made a terrific outcry and continued to limp about and 
put forth noise. After a quick examination to assure himself that no 
bones were broken, the psychologist offered the cat its dinner which had 
been standing ready. Why does the cat persist in limping on later visits 
to the kitchen? 

Horton and I found that every cat we dealt with, between fifty and 
sixty in all, exhibited very similar behavior. When escape from the 
puzzle box followed almost any behavior, colliding with the release, paw- 
ing it, backing into it, jumping to the top of the box and falling on the 
release, lying down and inadvertently rolling to contact with the release, 
heavy odds could be placed that the same movement would be repeated 
soon after the cat was returned to the box. None of these behaviors had 
a learning curve. Each appeared suddenly full blown. In only a few 
cases could anything like an improvement of the successful act be recog- 
nized. 

It is our belief that this characteristic of learning is explainable in 
terms of the effect of the response in question on the stimulus situation. 
Responses which left the cat in the box tended to disappear from its be- 
havior, though in some cases they were very persistent. But the re- 
sponse which opened the escape door was generally preserved. We sug- 
gest that this is explainable through the fact that escape removes the 
cat from the puzzle box but does not allow a new response to be
associated with the stimulus situations within the box. The cat has no way to
forget. R remains faithful to its association with S because unfaithful- 
ness would require that some rival response become associated with S, 
but S is now out of the picture. No new associations can be established 
with an absent stimulus situation. 
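
The argument of this paragraph can be put as a toy model. It is a sketch only, resting on a strong simplifying assumption: a stimulus situation is treated as a single label, and each response made in that situation simply replaces the previous associate. The names `run_trial`, `in_box`, and the listed responses are hypothetical and are not taken from the puzzle box records.

```python
def run_trial(association, situation, responses, releasing):
    """Emit `responses` in order while the animal is in `situation`.

    Each response made in the situation becomes its current associate,
    replacing whatever was associated before. The trial ends as soon as
    the releasing response occurs, because the animal then leaves the
    situation and nothing further can be associated with it.
    """
    for response in responses:
        association[situation] = response
        if response == releasing:
            break
    return association


# Hypothetical trial: the cat would have paced again, but escape came first.
assoc = {}
run_trial(
    assoc,
    "in_box",
    ["pace", "claw_bars", "back_into_release", "pace"],
    releasing="back_into_release",
)
print(assoc)
```

Responses that leave the cat in the box are each displaced by the next one. The releasing response is the last thing done in the presence of S, so on this simplified account it remains the associate of S when the cat is returned to the box.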

It is my contention here that we shall gain much new light on be- 
havior if we devote ourselves more zealously to observing the effects of 
response on stimulation. Every response must have such effects. 
Through movement an animal changes its view, the sound pattern
affecting its ears, its own pattern of proprioceptive stimulation from
muscles and joints. In this radical change which results from action lies 
the explanation of the direction of our learning. 

Our own responses not only bring about changes in the external 
world. They furnish cues for our further action. They eliminate, or 
sustain, or produce stimuli to action. And the consequences of this 
elimination, sustaining, or production are far-reaching. 

My dog reaches out and paws my foot as I sit reading. To get my 
attention, the ordinary observer would say. Of course, the dog does 
it to get my attention. This is not a fact, however, but an interpreta- 
tion. Its factual basis is that the dog makes movements or takes a pos- 
ture that was in the past formed and originated by my attention. 
Without this factual basis we are speaking on the level of Little Red 
Riding Hood who is satisfied by the wolf's explanation that his great 
ears are the better to hear her with. 

What would we find to be the explanation of the dog's gesture if we were to
follow the rules that are here suggested and note the history of the event 
in terms of stimulus-response sequences and response-stimulus se- 
quences? I reject immediately those softer accounts in terms of in- 
sight. There must have been a first use of the gesture, and I do not for 
one moment believe that dogs come into this world equipped with so 
strange a power for getting results. My own notion of what has hap- 
pened is that first, I have on a number of occasions scratched the dog 
behind the ear. The effect of this on the dog is to interrupt all other 
activities and keep him motionless in place. Common speech uses the 
word enjoyment; but we might try to stick to psychological facts. The 
stimulation of my scratching is an essential element of the dog's re- 
sponse. It is what serves to maintain his quiet pose. When my scratch- 
ing stops, the dog is released, but the sight of me, and the nature of his 
own response to me serve to call out in him a repetition or prolongation 
of his “behavior of being scratched.” When scratching stops, he is no
longer kept quiet by the scratching and is free to move. Whatever move- 
ment takes place will be within the limits of his present stance. I do not 
expect sudden barking or violent action. He has been standing quietly. 
He may move his head. If his movement had attracted my attention 
and brought a resumption of the scratching, I should expect that on 
the next occasion on which the stimulus situation was substantially 
the same, there would be a repetition of the head movement. This response
of the dog's would not be unlearned, because the stimulus situation
(waiting unscratched) which had become its cue with one repetition
is gone with the recurrence of the scratching. It is more or less of an
accident that his original head movement was unrewarded, and that reward
was reserved for a movement of his paw.

I have used the word “reward.” That is not a psychological word.
My scratching was not effective because it was a reward. It was effec- 
tive because it prevented the dog from unlearning his gesture with his 
paw. If I had, instead, cuffed him hard, the cuff would have undone the 
gesture with the paw as a habit, not because the cuff is a punishment 
but, in psychological terms, because a hard blow would have established 
in the dog a tendency to back away from me and there would have 
been no recurrence of the situation that led to the gesture with the paw. 

Once we have let ourselves in for scratching a dog’s ears, there is no 
natural end to the incident save our own fatigue. The dog’s response, 
his quiet posture permitting the scratching, can be continued indefi- 
nitely and will be interrupted only by eventual fatigue or an adventi- 
tious external event. But there are other actions that are self-terminat- 
ing. Descending a stair cannot go on indefinitely even though the first 
steps have established associative serial connections, because one of the 
peculiarities of staircases is that they have a bottom step and it would 
be only with the help of a miraculous steamshovel and a corps of engi- 
neers that we could be kept provided with steps to descend. If we raise 
an arm, the arm is now raised and the situation radically changed. The 
arm can not be again raised until it has first been lowered. 

This introduces a new admonition for our fact-collecting. Not only 
should we note S-R sequences. We should further note that any R in 
progress, whether that response is active movement or merely the main- 
tenance of a posture, sets strict limits on what can next be done. At 
any moment there are severe restrictions on the behavior possible to 
elicit, no matter what new stimuli are offered. 

We recall in this connection the work of Magnus on postural reflexes. 
When a cat has been decerebrated and is stood upon a surface, slight 
manipulation of its head can result in alterations of its whole muscular 
set. If the head is turned slightly to the right, the right fore-leg is 
flexed, the left extended, and the whole posture is made an obvious set 
for moving to the right. Older members of the profession can recall 
taking advantage of this postural adjustment by using leathern straps 
attached to the head of one of the larger animals in order to induce 
locomotion toward the side on which the rein was pulled. 

It is probable that an intact cat in the posture appropriate to taking 
off toward the right, can not be directly stimulated to take off toward the 
left. The original posture must be first relinquished and a second taken 
on. A whole field here calls for research. What movements are possible
from a given stance? What responses are elicitable when a person is
maintaining a given attitude? I would here include not only obvious 
physical attitudes but the more covert states which limit behavior. We 
must not give up the investigation of attention and readiness. 

What are the effects on a going action of sudden irrelevant stimuli? 
I confess that I do not even know the answer to so simple a problem as 
what a diner in a restaurant, just raising his cup of coffee to his lips, 
will do if a police whistle is suddenly shrilled just behind his head. Will 
the movement in process be suddenly energized and the cup thrown 
over his shoulder, or will the cup be dropped from his fingers? We should 
know enough about the rudiments of behavior to answer such questions 
without waiting for some drunken guest to conduct the experiment for 
us. 

Certain recent experiments in conditioning applied a signal under 
circumstances which allowed the animal either to have the leg flexed or 
to have it extended at the time the signal was given. There is small won- 
der that the results were ambiguous, since response was bound to be 
ambiguous. Extension is impossible when the leg is already extended. 
Flexion can occur as an active response only when the leg is not already 
flexed. This reminds us also of the original admonition to note the facts 
of stimulus-response sequence. When a signal is alternately or at ran- 
dom presented during extension and flexion or during running and dur- 
ing cowering, but no record is kept of the response following the signal, 
we should not expect to find any definite generalization in our returns. 

Horton and I had a very considerable amount of fact-trouble in 
making our observations of our cats. Though we made notes and in a 
number of cases a motion picture record, there was often doubt whether 
or not a sequence of movements of the cat in the box could be reported 
as substantially the same as a previous sequence. The statement just 
made that the major determiner of the animal's actions is the present 
state of action or rest is an interpretation rather than a fact. It is an 
interpretation to which we found ourselves compelled; but it is not an 
interpretation to which we could be sure other psychologists would be 
forced. A large part of the time we could tell at any moment what this 
particular cat would do next. Our ability to do so was based on having 
seen it execute the same routine in a previous trial or earlier in the cur- 
rent trial. Having started any former routine, we could predict its 
continuance. It is possible that this prediction should have been under- 
taken, to compare with prediction on any other basis. This seemed, 
however, a bit absurd, since we knew no other basis on which to base 
predictions. The behavior of other cats allowed such prediction only in
the most general terms, and did not apply to specific movement series. 

The response-stimulus sequence to which I have been referring in- 
cludes, of course, the familiar concept of “set.” What I am urging is 
that the development of basic psychological theory demands an exten- 
sion of our collection of facts in this class. We should extend radically 
our knowledge of sets and their consequences, and we should do that by 
observing what responses are elicitable from a given stance or set and 
what are not. Social psychologists are collecting facts about attitudes 
and their patterns, particularly those organized about words. This must 
eventually be reinforced by further theory of the elementary behaviors 
out of which attitudes are made. The watchful therapist is fully aware
of attitudes in his patient. The skilled mental tester learns to direct a 
child's behavior into attitudes that permit testing. We must eventually 
know more of the facts of these attitudes. What is the factual descrip- 
tion of negativism or of resistance? To what extent are such attitudes 
general in their effects and to what extent specific? About what cues 
are they organized? What occurs when an attitude of resistance changes 
to one of acceptance? How do prevailing attitudes control attention 
and perception and learning? In other words, what effect does our own 
behavior have on our own behavior? 

I am reminded here of an extremely interesting paper of Heid- 
breder's on cognition recently published. In that she reports the results 
of some experiments which establish a certain hierarchy among con- 
cepts. Confronted with three sets of material in one of which the cogni- 
tion is of an object, in another of a common form, and in the third, of 
a common number, the three are arrived at by subjects uniformly in 
that order — object, form, number. My own interest would be in the 
process through which cognition is attained. That process probably in- 
cludes “trial and error” naming, and is not hopelessly un-get-at-able. 
My point is that cognition is not a mystery which involves a hidden 
thought process and a mental concept for which eventually a word is 
found, but is a case in which the subject uses his verbal repertoire in 
trial and error fashion until the word that “works” is hit upon. Proper
search may discover that it is the careful noting of stimulus-response 
sequences that will furnish the factual basis for a theory of cognition. 
The cognition of objects may prove to be more ready than cognition of 
form or number because physical objects have a way of offering stable 
and recurrent, dependable, patterns of stimuli. Cognition of number 
may prove to be dependent on the initiation of counting which may be
injected early in the process by an associative hint, or wait until trials
of object names prove futile. Concepts will turn out to be language in
use, and use will turn out to follow the changes of learning by the same 
laws that will be worked out for action. 

Up to this point a number of suggestions have been made for the 
direction of our search for psychological facts, for the ultimate purpose 
of understanding learning. One of these was that we have tended to 
neglect the behaving organism and to give undue attention to the ex- 
ternal results of movement on the outer world. A second was that we 
would do well to recall that stimuli are the normal occasions for all re- 
sponse. A third was that we should note carefully the sequence of stim- 
ulus and response if we hope to get at the basic principles of associative 
learning. A fourth was that our interest in the role played by the organ- 
ism in determining the response should be responsible for stern efforts 
to develop an objective, factual basis for our descriptions of the states 
of the organism that enter into the determination of behavior. A fifth 
suggestion was that closer attention to the response-stimulus sequence 
would be profitable in explaining motivation and the direction of learn- 
ing. 

My final concern is harder to name than these. I am in entire sym- 
pathy with the belief that quantitative treatment is to be aimed at in 
all scientific fact gathering. Number is the chief tool of science. But 
in our zeal to be scientific, I am convinced that we have been led into 
certain lines of experiment in the field of learning because these lines 
promised at least to yield numerical comparisons, curves. Because re- 
peated trials can be given a series of ordinal numbers we have too read- 
ily fallen into the practice of treating the number of trials as a quantity, 
the more trials the better. We have been led to neglect what I am con- 
vinced is the central problem of learning, namely, what change occurs in 
behavior as the result of a single action. 

In the laboratory we glory in experiments with fifty to fifteen hun- 
dred repetitions and their resulting curves. In nature these repetitions, 
as exactly duplicated as possible, simply do not occur. But learning 
does occur. The experimental results with a long series of repetitions 
have all the desirable characteristics of scientific fact. Their numerical 
analysis can be made by agreed methods. We must all recognize a mean
and a standard deviation, or a difference and the standard deviation of 
a difference. And we are fairly agreed on the inferences that can be 
drawn from such analysis. 

In the field of learning this very commendable effort to be scientific 
has led us toward studies of success, the trend of errors with repetition,
the reduction of time with practice. But it is a characteristic of a score
of total errors (in a maze, for instance) to omit examination of the suc- 
cessive changes that constitute learning. Indefinite amounts of unob- 
served learning may enter into our final result. Before our criterion of 
success may be reached, learning, unlearning, relearning may have oc- 
curred over and over again. 

We expect our friends to remember an engagement after one notice. 
We expect clinic patients to be different at each interview because of the 
last. We expect one quarrel to change attitudes. We expect one read- 
ing of a paper before this Association to leave some auditors with an 
impression. One spoiled egg may leave us for a time cautious. Once a 
rat has visited our grain sack we can plan on its return. Terror is called 
out once in the bird dog by the report of his master’s gun, and the dog 
is now gun-shy. 

But in the laboratory we assume that the response fixed in fifty 
trials was one fiftieth fixed in each trial. 

Repetition has its place in learning, but repetition is effective only 
in those complicated instances in which what is learned is not a response 
to a stimulus, but a whole repertoire of responses to a large variety of 
stimuli. We have learned to achieve some result by means which vary 
according to the circumstances. Learning skills takes time and practice 
and furnishes beautiful learning curves and admirable data for statisti- 
cal analysis. This is because they involve many and complicated learn- 
ings. It is here being suggested that the development of a scientific 
psychology requires that we investigate learning in its simplest forms. 
What happens as the result of one pairing of a stimulus pattern with a 
response that alters the previous effect of that pattern? 

No group of psychologists has done more toward investigating this 
phenomenon in its elemental form than the Yale group under the in- 
spiration of Clark Hull. With that work I have only one quarrel. This 
is that they have not examined adequately the R-S sequences which I 
have mentioned. Hull’s theory, which has dominated the collection of 
facts in the Yale laboratory, is the theory of reinforcement, not a
straight associationism. It assumes that an association is formed, or is
not formed by virtue of a subsequent reinforcement or reward which 
somehow works upon traces of the S-R event and confirms or destroys 
the associative connection. 

This theory is in line with the great tradition of the psychology of 
learning. Thorndike in his Animal Intelligence, C. Lloyd Morgan in 
his book of the same title, and Hobhouse in his Mind in Evolution all
speak in terms of a confirming reaction, which determines whether
or not the association will be made. 

No one questions the effectiveness of reward and punishment, or the 
effect of after-effects of a reaction on learning. But this statement of 
learning theory has led to an entire neglect of the observation of R-S. 
The confirmation or reward or punishment is supposed to have its ef- 
fect by virtue of simply being confirmation or reward or punishment, 
not by virtue of the effect which it has on the stimulus situation and 
therefore on subsequent behavior and the opportunities for further 
learning. There is excellent reason for believing that both reward and 
punishment are effective by virtue of what they make the animal do, 
not simply by virtue of their own nature. Adherence to the theories of 
confirmation or reinforcement has led to quantitative results, it is true. 
It is highly probable that close examination of the action caused by 
punishment and the action caused by reward will discover that the 
learning which takes place can be adequately described in terms of the 
new associations set up by the new action. Reward, as Thorndike has 
remarked, tends to leave the animal doing the same thing in the same 
situation, — eating while food is present. Punishment induces the ani- 
mal to do something different in the same situation. A theory of asso- 
ciative learning in its straight form without appeal to after-effects 
would lead us to predict in these instances what happens. The animal 
does not unlearn its tendency to do what it previously did if rewarded 
because nothing has happened to establish rival responses to the situation.
It does not learn not to eat when the food is finally presented although
it does eventually desist, because either the food or the inner
hunger is now absent and cannot be re-conditioned in their absence. 

Culler’s laboratory, like Hull’s, has led in the investigation of rele- 
vant facts on learning in its elemental form. Some of that work I should 
like to see repeated with closer adherence to the S-R prescription. 
Stimuli are applied without observing what the animal’s actual next be- 
havior is. In a number of instances two rival responses, flexion and ex- 
tension, take place following the signal. That the result of this mixed 
practice turns out to be not a straight exemplification of association is 
not to be wondered at. 

In general the experimenters who work with what has come to be
called instrumental conditioning also fail to observe the maxim to
observe the response following the stimulus. It would seem obvious 
that an investigation of association would require in the first place 
that the stimulus and the following response be at least made a matter 
of record, but experiments in instrumental conditioning seldom record
what the animal was doing when the signal was given. The returns 
therefore throw no light on association, but only on the effects of re- 
ward on subsequent response to the signal. In these experiments not 
only does the S-R fail to be observed, but also the sequence, R-S'. 
None of the experimenters is interested in the immediate behavioral 
consequences of reward, but only in the remote effects of reward upon 
a previous stimulus. 

May I here recall the initial theme of this paper. It has been con- 
cerned with the future development of psychology as a science and par- 
ticularly with the possible effects of a sudden increase in the numbers 
of psychologists and a sudden enormous extension of the application of 
psychology to practical affairs. None of us doubts that human living 
will be improved by that extension. Most of us would accept that im- 
provement as the final goal and justification of all human science. But 
we must remember that the sciences have developed through an objec- 
tive detachment from immediate profit, and that, in the overwhelming 
majority of instances, steps forward in scientific theory have been inde- 
pendent of practical application. 

The hope that is here being expressed is that the new psychologists 
will in general not allow themselves to become mere technicians using 
psychological methods and techniques for the accomplishment of prac- 
tical ends, that in the training of the new generation of psychologists 
we take care to cultivate an interest in theory as well as practice. We 
are entering a period of increased usefulness. It is to be hoped that it 
will not be a period in which theory stands still. Our factual information 
is bound to increase at a greatly accelerated rate. For that increase to 
result in the advance of psychology as a science two things are necessary. 
One is that theory be continuously produced and continuously amended 
and continuously used to guide the collection of fact. The other is that 
we remember to conform to the rules that have been responsible for the 
remarkable achievement of the scientific tradition, the use of objective 
evidence which means a basis in facts open to the observation of all
who are interested and described in public terms that must be accepted
by other scientists. These requirements may bear heavily on many cur- 
rent movements in psychology, in which recognition of events is claimed 
to be an art not communicable by ordinary means, open only to the 
inner members of a cult and closed to outsiders. Facts may accumulate 
without theory; but they will prove to be unstable and of little profit in
the end. Theories may flourish if their basis lies not in scientific fact
but in opinions and interpretations acceptable only to the members of
a limited faction; but they will be bad theories. Schools flourish only 
when theories are not carried back to public facts. Unless psychologists 
maintain an interest in general theory the fields of psychology will in- 
creasingly become independent collections of undigested information. 
Introduction

For more than a decade contemporary psychiatry has been employ- 
ing extensively as a treatment for psychiatric illness the procedure of 
artificially inducing comatose and convulsive states in mental patients. 

It is not at all difficult to find historical antecedents in the care and 
management of the psychiatric patient which superficially, at least, 
appear to be like the current methods of shock therapy in that a sudden 
and sometimes quite aggressive change is effected in the subject’s en- 
vironment leading to a precipitous alteration in the patient’s physical 
and psychophysiological state. The content of these methods varied 
with the change in cultural thought and are, indeed, as adequate symp- 
tomatic reflections of the history of psychologic thinking as, for exam- 
ple, are the historical changes in the interpretation of dreams or the 
evolutionary variations in the concept of the soul. 

Occipital branding was perhaps one of the earliest of these methods, and 
the variations of the well-known water cures wherein the “frantic person was 
placed with his back to the water without being permitted to know what was 
going to be done” and who was then “knocked backwards into the water by a 
violent blow on the chest and tumbled about in a most unmerciful manner 
until fatigue had subdued the rage” (159) have a long history of therapeutic
existence. Even in the time of Herman Boerhaave (1668-1738) “ducking” was 
a common psychiatric procedure as it was, too, in the remedial repertoire of 
Benjamin Rush, the father of American psychiatry. Additionally, Boerhaave 
recommended a special twirling stool on which by spinning a patient until he 
became unconscious his brain could be “rearranged” and the patient made 
normal (159). 

The Roman, Cornelius Celsus (14 A.D.), in his De re medica wrote that a
certain amount of physical coercion and a measure of gentle cruelty and the 
more refined tortures of inducing fear might bring the mental patient to his 
senses. Philippe Pinel, the famous psychiatric emancipator, also believed in
fright as an effective psychiatric remedy, and Johann Reil (1759-1813), while 
prescribing throwing patients into water and subjecting them to the firing of 
cannon, also was a protagonist of other forms of non-injurious torture (159).

Certain followers of Asclepiades, like Themison, approved of large doses of 
alcohol, antedating by two thousand years the therapeutic efforts of Kantoro- 
vitch, Constantinovitch (75), Berrington (12) and others who have relatively 
recently attempted to modify schizophrenic stupor by brandy taken orally or 
by alcohol given intravenously. These students of Asclepiades were also con- 
vinced of the merits of whipping the mental patient as were also the Anglo- 
Saxon priests described in Bald's Leechbook who believed that if one took the 
skin of a porpoise and worked it into a whip and then would “swing the man
therewith, soon he will be well. Amen.”

Blood-letting, perhaps reaching its height as applied to psychiatric treatment
under the Parisian, Édouard François Marie Bosquillon (1744-1816), was
also widely practised over a considerable period of time and must undoubtedly 
have produced in some patients surgical shock if nothing else. 

Following hard upon the heels of the creation of the Leyden jar, the first 
electrical condenser, by Pieter van Musschenbroek in 1746, Richard Lovett 
10 years later claimed success in treating mental diseases with electricity. 
Arndt (7) in 1870 reported good results in depressions after electrical shocking, 
and other investigators, notably Huff (67), Haynes (67), and Allbutt (4),
were favorably inclined toward the psychiatric therapeutic use of electric current.

Cyanide, which among other effects depresses brain metabolism, had been 
used by Loevenhart (65) as recently as 20 years ago to obtain remissions in 
mental patients. Much earlier, hellebore had been extensively used, leading to 
the term “helleborism” as meaning a form of psychiatric treatment. 

Hippocrates had in his times observed that organic disease supervening 
upon mental illness sometimes caused an abatement of psychotic symptoms, 
and remissions in psychiatric patients after surgical anesthesia, aborted suicide, 
fractures, and intercurrent disease are known to every psychiatrist who has had 
any extensive institutional experience. 

The enumeration of these historical forerunners of contemporary 
shock treatment, however, does not assume any necessary continuity 
or similarity of motivation in the use of shock therapy, unless one may 
speculate upon the possible common presence of the less consciously 
defined motivation of aggression against the patient evoked in the 
physician by the inadequately met challenge of the etiological and 
therapeutic demands of the patient’s illness, and one may perhaps also 
tentatively consider the incidental motivational importance of the 
residual in the therapist’s personality of child-attitudes of the expect- 
ancy of the sudden, miraculous, magic resolution of puzzling barriers. 

Shock therapy, whatever the ultimate evaluation of its therapeutic 
efficacy or desirability may be, provides extraordinary research oppor- 
tunity under fairly controlled conditions for the investigator interested 
in either the psychologic or the neurophysiologic descriptions and inter- 
pretations of the multitude of happenings associated both with the 
active process of treatment and with the immediate and remote results 
of these therapeutic shock procedures. 

For the psychologist, the controllable creation of convulsions in the 
human subject makes possible, for example, a precisely defined investi- 
gation of personality reintegration after convulsive dissolution. Many 
of the characteristics, both theoretical and actual, of psychological 
ontogenesis may be studied in this reintegration of the personality. 
Disturbances in perceptual organisation and the relation of such disturbances
to general cognitive functioning, the description of the determinants
of perception, the relation of memory-content to problem-solving,
learning, and logical thinking, the relation between emotion
and memory and the relations between emotion and various personality 
dysfunctions, and the investigation of general problems of learning and 
habit-structure are but a few of the subjects which may be studied, 
from perhaps new aspects, by the observation of post-convulsive be- 
havior. 

The psychopathologist will be interested in what he may recognize, 
as did Sakel, as “activated symptoms” of a psychosis, or he may see
symptoms suggestive of failure of inhibition and control and of the re- 
lease of ontogenetically earlier modalities of behavior. So, too, the 
relationship of the changes in the person which are produced by convulsions
to the characteristics of the pre-treatment psychotic personality
merits closer scrutiny than has as yet been reported. The field of
prognostic evaluation and psychological aid in the selection of can- 
didates for successful therapeutic issue has been more widely studied. 
Systematic psychologic comparisons among convulsive and comatose re- 
actions produced by the various precipitating agents and the examina- 
tion of the similarity and difference of the after-effects of these induced 
convulsions both with reference to the varieties of artificial induction 
and to the immediate and remote sequelae of endogenous convulsions 
have not yet been adequately accomplished. Some of such comparative 
evaluation, however, is now much less important because of the empiri- 
cal selection of electrically induced convulsions as the currently pre- 
ferred treatment in most cases, with insulin shock now considered as 
possibly primarily valuable in certain kinds of schizophrenia. Psycholog- 
ic assessment of what has been the nature of the personality dynamics 
in the cases of remission and recovery needs to be increasingly critical 
as a basis for the rationalisation of the therapy. The psychologic as well 
as the neurophysiologic interpretations of the therapeutic efficacy and 
mechanism of shock therapy must be substantiated progressively by 
objective research. 

Neurophysiology is interested in the happenings which afford an 
opportunity to adduce facts confirming or depreciating theories of 
nervous system function and of the neurophysiologic correlates of be- 
havior. Artificially induced convulsive reactions exhibit various degrees 
of neurophysiologic dysfunction and these shock-induced syndromes ap- 
pear to be reversible. Critical attention directed to these conditions 
provides neurophysiology with a pathological analysis of the nervous 
system, as well as affords an opportunity to attend more closely to the 
neurophysiology of mental disease. 

Since the literature directly and indirectly relating to shock therapy
has become so extensive, the formulation of future psychologic research
and theory may be aided by a summary and critical review of
pertinent psychologic description and interpretation evolving out of the 
widespread application in psychiatry of these recent therapeutic tech- 
nics. 

Shock Therapy 

Induction Methods and Characteristic Reactions 

Insulin. In 1928 Manfred Sakel (124) began to treat abnormal men- 
tal conditions by insulin-produced hypoglycemia after his experience in 
treating drug addicts for abstinence symptoms with insulin in order to 
depress the “dominatingly increased” activity of the sympathetic nervous
system which he considered was associated with the withdrawal effects.
In 1933 in the University of Vienna Psychiatric Clinic he began to
treat schizophrenics by induced hypoglycemic reactions. However, it 
had already been discovered in Oslo (83) quite by accident that a 
mental patient could be “cured” by the induction of hypoglycemia, and 
Day and River had independently made the same discovery here in 
America. An Hungarian doctor also claimed to be using insulin in the 
treatment of mental conditions a few years before Sakel initiated his 
treatment. 

Sakel’s original procedure of treatment has undergone many modi- 
fications, but, in general, sufficient insulin is given to induce first quiet- 
ness, then drowsiness, sleep, and finally coma. A comatose reaction has 
ensued in an individual with as little as five units of insulin and has not 
followed in another patient even after 400 units of insulin. The
variability of individual behavior in the hypoglycemic reactions is
considerable. Some patients evidence great restlessness and even excitement
before going into coma, while others may attain the comatose state 
without much expressive activity at all. From the physiological aspect, 
sweating, increase in pulse rate, and muscular movements may be seen 
in the pre-coma. Isolated twitching and jerks in the body musculature 
may then merge into a generalised convulsion, although usually the 
coma is interrupted before a convulsion eventuates. 

Sakel writes of the cerebral functions gradually disappearing, level 
by level, as the hypoglycemia intensifies and then returning as the pa- 
tient awakens within the quarter hour after termination of the hypo- 
glycemic reaction by the administration of glucose. Frostig (42) has 
further detailed the neurologic description during five hours of the hypo- 
glycemic process. In the recovery period confusion, psychomotor ex- 
citement, dysphasia and other disturbances are seen. Most patients are 
amnestic for the period of hypoglycemia and where the coma has been
prolonged or where a convulsion has occurred retrograde amnesia is
observed. In this latter respect it is interesting to know that Sakel in his
early studies attached a certain therapeutic importance to epileptic and 
epileptoid motor reactions and stressed the necessity of long treatment 
not only for the modification of secondary symptoms but also for the 
control of the symptoms of the “activated psychosis” which Sakel
thought to occur at a stage in the treatment when the patient showed no 
signs of his psychosis except during the hypoglycemic state. However, 
as Kalinowsky (70) particularly points out, the acute psychotic picture 
which Sakel called the activation of the psychosis is undoubtedly a 
transient organic reaction to the treatment and very probably would 
occur in a non-psychotic subject as it does, indeed, in psychoneurotic 
subjects. 

Early in the development of insulin shock therapy the Viennese 
school stressed the importance of a period of euphoria, accessibility, 
and inhibitory release in the early hypoglycemic period, although the 
occurrence of such behavior is by no means a constant feature of insulin 
shock induction. Psychotherapy was used during the existence of this 
condition as well as in the post-coma phase of recovery. Such transitory 
insulin-induced accessibility may be compared to current narcoanalytic 
technics using soluble barbiturates. Bleckwenn (15) had in 1930 used 
sodium amytal in the treatment of the psychoses. In connection with 
the mention of the narcoanalytic technic it is interesting to observe that 
Thomas (143) and others have reported that mute and inaccessible 
psychotic patients who respond to intravenous sodium amytal will also 
respond to shock therapy and Thomas has proposed the use of sodium 
amytal as a prognostic aid in predicting the outcome of convulsive 
therapy. 

There is no space in this paper to discuss the possible neurophysio- 
logical mechanisms of the psychosis-modifying effect of insulin treat- 
ment. Sakel (125) thought that adrenalin hyperactivity produced a 
lowering of neural thresholds in the cerebrum, thus reviving the phylo- 
genetically oldest and infantile constellations and pathways among 
the neurones. Insulin neutralises the effects of adrenalin and was thus 
considered to reverse the whole process, the resulting hypoglycemia pro- 
ducing a vagotonia. Gellhorn (44) believes that insulin leads to an ex- 
citation of the sympathetic-adrenal apparatus through hypoglycemia of 
the brain, and Heilbrunn’s (58) experiments are in favor of the hypothe- 
sis that insulin creates a sympatheticotonia, although Hadorn (56) has 
indicated that such effects may depend upon the dosage and he has 
demonstrated that small doses of insulin stimulate the vagus while 
large doses cause a primary secretion of adrenalin. Pfister (58) thought
with Sakel that hypoglycemia damps the sympathetic nervous system.

However, as Riess and Berman (117) have indicated, there is no 
evidence of adrenalin overproductivity in schizophrenia or in other 
psychotic states. Insulin, in fact, increases adrenalin in the hypogly- 
cemic phase. Parker (108), too, has shown that the autonomic functions 
during the hypoglycemic states are marked by oscillation between sym- 
pathetic and parasympathetic tonus, due to the increased activity of 
one system rather than a diminished tone of the other. Moreover, this 
investigator found that recovery occurred both in cases where chiefly 
sympathetic and where primarily parasympathetic excitation was evi- 
dent. Fortuyn (37) also observed both sympathetic and parasympathetic 
activity in the hypoglycemic state. 

Other explanations have stressed, like that of Georgi (45), the decrease
of permeability of neuronal membranes in the psychoses facilitating
the retention of toxic products, a condition which is then altered
by the insulin effects. Assumptions have been made, like that of Demole 
(28), which are based upon a belief that insulin tends to correct disturb- 
ances of neuronal oxidation which results in an accumulation of sub- 
oxidised products of metabolism which produce schizophrenic symptoms 
as a form of cerebral toxemia. Unfortunately, these descriptions point 
to an insulin effect upon an assumed but unproven physiopathology of 
the psychoses. 

Metrazol. This is a synthetic compound clinically related to camphor.
In this substance the six carbon ketone ring of camphor is condensed 
to form pentamethylenetetrazol (53). Although Paracelsus had used
camphor as early as the 16th century to produce convulsions and al- 
though Oliver in 1781 treated a manic psychosis by camphor-induced 
convulsions, cardiazol, as the synthetic compound is called in England 
and on the continent, was synthesized and first investigated pharmaco- 
logically by Schmidt, Hildebrandt, and Krehl (127), and it was soon 
being used as a stimulant in circulatory and respiratory collapse. In 
1926 Blume (16) found it possible with large doses to produce
convulsions in animals.

Meanwhile, a few years before the use of psychiatric convulsive ther- 
apy was begun, Nyiro and Jablonsky had observed in 1929 that cases 
of epilepsy in which schizophrenic features were prominent recovered 
rapidly from the schizophrenia when fits were frequent, and in 1930 
Muller (97) described two cases in which schizophrenic illness recovered 
quickly after the appearance of spontaneous epileptic fits. Glaus (49) 
had also reported a year later that the combination of schizophrenia and
epilepsy was rare, although such an observation was certainly
contradictory to textbook descriptions of the well-known continental
psychiatrists of the time, like Kraepelin and Bleuler, who found nothing
extraordinary about the diagnosis of schizophrenia with epilepsy.
However, in contrast, it is interesting to note that Steiner and Strauss,
writing in Bumke’s well-known Handbuch der Geisteskrankheiten, would
question the correctness of the diagnosis of schizophrenia in any case with
convulsive attacks. 

Glaus investigated 6,000 schizophrenes and found that only eight 
had had fits at any time in their lives and, more interestingly, he dis- 
covered that four of these eight patients recovered shortly after the fits 
appeared. Other scattered reports in the literature also indicated a pos- 
sible antagonism between schizophrenic catatonia, at least, and epi- 
lepsy. 

In 1934 von Meduna, the director of a mental hospital in Budapest, 
who had been experimenting with camphor-induced convulsions in ani- 
mals, began to produce convulsions in chronic schizophrenics by the 
intramuscular injection of 25% camphor in olive oil, being motivated 
largely by the consideration of the possible salutary effect of epileptic 
seizures on the schizophrenic illness. In addition to what he considered 
the incompatible occurrence of epilepsy and schizophrenia in the same 
individual, Meduna also sought to establish that epileptics belong 
chiefly to Kretschmer's athletic type and that the neuroglia in epileptics 
is hyperplastic in contrast to the supposed neuroglial aplasia of the 
schizophrenic. The carbohydrate metabolism was also evidenced to be 
slowed in schizophrenia and accelerated in epilepsy. 

Against such evidence of the antagonism of schizophrenia and epi- 
lepsy there have been several observations published in recent years. 
The paper of Yde, Lohse, and Faurbye (158), for example, recounted 
the data of 715 cases of schizophrenia, 20 of whom had had convulsive
attacks, and epilepsy was established in five of these cases, which is 
about twice the incidence expected by chance. Hoch (66) has also in- 
dicated a positive tendency for epilepsy and schizophrenia to be asso- 
ciated. Moreover, the standard classification of mental disease, in this 
country at least, may lead to serious statistical collection errors in that 
schizophrenic symptoms occurring with epilepsy are very likely to bear 
the diagnostic label “Psychosis due to convulsive disorder (epilepsy)”
and so completely escape the attention of the searcher who looks only 
for schizophrenia with an accompanying epilepsy. As this standard 
nomenclature indicates, the organic bias of contemporary institutional 
psychiatry will probably put the major diagnostic emphasis upon the 
epilepsy. 
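
The proportions cited in the foregoing paragraphs may be set side by
side in a brief computation. The following is only an illustrative
sketch in Python. The function name rate_percent is hypothetical, and
the chance rate is not stated in the source; it is merely inferred from
the remark that the observed incidence is about twice that expected by
chance.

```python
def rate_percent(cases: int, total: int) -> float:
    """Return cases expressed as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total


# Glaus: 8 of 6,000 schizophrenic patients had had fits at any time.
glaus_fits = rate_percent(8, 6000)

# Yde, Lohse, and Faurbye: 715 cases, 20 with convulsive attacks,
# epilepsy established in 5.
yde_attacks = rate_percent(20, 715)
yde_epilepsy = rate_percent(5, 715)

# Assumption: "about twice the incidence expected by chance" is read
# here as a ratio of exactly 2, which implies the chance rate below.
implied_chance = yde_epilepsy / 2

print(f"Glaus, fits at any time:       {glaus_fits:.2f}%")
print(f"Yde et al., convulsive attacks: {yde_attacks:.2f}%")
print(f"Yde et al., epilepsy:           {yde_epilepsy:.2f}%")
print(f"Implied chance rate (assumed):  {implied_chance:.2f}%")
```

On these figures the rate of fits in the Glaus series is roughly 0.13%,
against about 2.8% for convulsive attacks and 0.7% for established
epilepsy in the later series.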

Most metrazol convulsions are now induced by the intravenous injection
of the drug so that the usual interval between the commencement
of the injection and the initial signs of convulsion is about ten
seconds. The reaction may be either subconvulsive or generalised and 
is usually evidenced by sudden pallor, a short cough, muscular twitch- 
ing, and a terrified expression, and sometimes by a cry or scream. Un- 
consciousness after the fit lasts for variable intervals and the post- 
convulsive recovery is characterised by the same sequence of events as 
occurs after any generalised convulsion. 

Intramuscular injections of triazol produce a slower prodromal con- 
vulsive picture than intravenous metrazol and considerable expressive 
activity may be seen in the patient before the convulsion is precipi- 
tated. 

Multiple convulsions may be produced by large doses of cardiazol. 

Denyssen and Watterson (29) believe that the convulsion following 
cardiazol injections is due to sudden vasoconstriction, but Georgi (46) 
thinks that the pathogenic factor in the initiation of convulsion is not 
to be found in a primary vasospasm but in the ionic change at the cell 
membranes, and he has suggestively demonstrated that the ionic 
changes occur only when the cardiazol injection is sufficient to produce 
a generalised convulsion. This is possibly significant in that it is gener- 
ally accepted that subconvulsive reactions are therapeutically much less 
effective than are total convulsive reactions, although the contention of 
Polatin, Spotnitz, and Wiesel (116) that in the insulin procedure hypo- 
glycemia alone without coma or convulsion is adequate to bring about 
remissions must be kept in mind. This observation is also substantiated 
by Krasnouchkin and Hanlarian (82) who obtained very favorable 
therapeutic results both with and without provoking reactions of coma. 
Hill (64), too, has claimed success by treating schizophrenia and other 
mental diseases with small doses of insulin (5-10 units) and histamine. 
Hill proceeds on the assumption that there exists in psychiatric disease 
a pathologic barrier in the endothelial walls of the capillaries against the 
normal interchange of plasma. Histamine, particularly, is considered 
to affect this pathologic barrier so that capillary walls are made more 
permeable to the osmotic cell-nutrient interchange. 

Ammonium Chloride. Bertolani (118) in 1938 began producing 
epileptiform seizures by the intravenous injection of a solution of am- 
monium chloride. This method has been very little used in the United 
States, but it enjoyed a temporary popularity in India in the hands of 
Rizvi (118) and on the continent and in England it had a number of 
protagonists, including Mazza (118), Dax (26), Martinez and De la
Vega (8S>), and others. The reaction usually consists of a pre-convulsive
period lasting 10-20 seconds in which the face becomes flushed, and in
which hyperpnea, pupillary dilatation and loss of reaction to light, and 
absence of corneal reflex are seen. The convulsive phase proper is usu- 
ally mild but may be typically epileptic, although ordinarily only mus- 
cular twitching and extension and flexion of the legs and arms are 
observed. No cyanosis, incontinence, or biting of tongue is commonly 
present. Afterward the patient may be disoriented. It is assumed by 
Rizvi that the convulsive reaction is produced not by alteration of the 
blood pH but by the irritant effect of the ammonium chloride on the 
blood vessels, causing vasoconstriction. 

Coramine. 12 to 15 intravenous injections of 10-25 cc. of coramine 
(Nikethamide) solution, a drug which like metrazol is a central nervous
system stimulant, have been used as a course of treatment for psychi- 
atric syndromes. Following injection, an initial apnea results followed 
by hyperpnea, accelerated heart rate, increased blood pressure, profuse 
sweating, tremors, and nystagmoid ocular movements. Consciousness 
is retained, but confusion, disturbance of attention, and general "cloud- 
ing" of mental function may last for about 10 minutes. Anxiety and 
fear expressions are seen. Skorodin (131) claims 19% remission ob- 
tained in selected cases. 

Electric Shock. The currently most widely used shock procedure 
originated from the use of electricity in producing convulsions in dogs. 
Leduc had in 1900 produced electrical sleep in animals, and Robino- 
vitch (119), working in Leduc’s laboratory, found that her electrical 
stimulation of the dog’s brain sometimes produced epileptiform con- 
vulsions, an occurrence which five years later was confirmed by Weiss 
(152). In 1929 Viale (145) was inducing epileptic seizures in dogs by
placing electrodes in the mouth and rectum and passing an electrical 
current. Cerletti (21), who had also had extensive experience in the 
production of electrically induced convulsions in animals, inaugurated 
in 1938 with his collaborator, Bini, the electroshock treatment of the 
human psychotic subject. 

Four hundred to 500 milliamperes of alternating current are usually 
passed for about .2 second through electrodes placed on the fronto- 
temporal cranium. Subconvulsive reactions or generalised convulsions 
ensue concomitantly with the passage of the current. Wilcox (153) 
conveniently divides the electroshock convulsion syndrome into an ini- 
tial tonic phase followed by a clonic phase, an atonic phase, a stuporous 
phase, and a post-convulsive mental state. 

The voltage of alternating current necessary to produce an effective
stimulus is usually more than 100 volts. Cortically initiated muscular
movements may be evoked by less than 10 milliamperes when the 
stimulation is applied directly to the brain, but in electroshock pro- 
cedures the intervening skull and tissues as well as the reduction of 
current density because of diffusion of current lines of flow make a rela- 
tively high voltage necessary. Also as Hemphill and Grey Walter (61) 
point out, the alternating current stimulus is repetitive. Since electro- 
physiologists believe that the physiologically important stimuli arise at 
the negative pole of the applied current, an adequate electrical tension 
applied for .2 second in 60 cycles of alternating current will result in 12 
effective stimulations of each cerebral hemisphere. The duration of 
current application, therefore, is the moderator of the number of times 
the brain is electrically stimulated. The smaller the number of current 
volleys, the greater is the current strength necessary to induce convul- 
sion. It has also been shown by a number of workers that Ohm's law 
does not apply to the passage of current through tissues. With the flow 
of current the initial resistance is modified by the factor of reactance 
and possibly by capacitance and other determinants so that the meas- 
urement of resistance is of little value in calculating current effect. 
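
The count of 12 effective stimulations follows from simple arithmetic,
which may be sketched in Python. This is an illustration under a stated
reading of the text, not a formula given by Hemphill and Grey Walter:
it is assumed that each full cycle of alternating current makes each
electrode the negative pole once, so that each hemisphere receives one
effective stimulus per cycle. The function name effective_stimulations
is hypothetical.

```python
def effective_stimulations(duration_s: float, frequency_hz: float = 60.0) -> int:
    """Number of effective stimuli delivered to each hemisphere.

    Assumption: one effective (negative-pole) stimulus per hemisphere
    for every complete cycle of the alternating current.
    """
    if duration_s < 0 or frequency_hz <= 0:
        raise ValueError("duration must be >= 0 and frequency > 0")
    # Round to the nearest whole cycle to avoid floating-point residue.
    return round(duration_s * frequency_hz)


# 0.2 second of 60-cycle current, as in the text.
print(effective_stimulations(0.2))

# Halving the duration halves the number of volleys.
print(effective_stimulations(0.1))
```

On this reading, shortening the application reduces the number of
volleys in direct proportion, which is the sense in which the duration
of current application moderates the number of stimulations.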

Friedman and Wilcox (41) have used unidirectional wave forms of 
current, including half-sine waves, galvanic current pulses, combinations 
of these two forms, and repetitive condenser discharges. It has been 
stated that some alterations in the wave form of the current produce the 
convulsive effect with less amperage and lessen the post-convulsive dis- 
orientation and dysmnesia. 

Berkwitz (10, 11) introduced an electrical shock method in which 
the stimulation is effected by about 30 shocks of faradic current from 
an induction coil, one half second in duration, with current strength 
under the convulsive level. Two to 45 such treatments are given in 
which fear and some pain are characteristic reactions. He reports in one 
study (11) that 35% of chronic patients so treated showed improvement, 
and one complete remission was obtained. 

Electronarcosis. Reminiscent of Leduc's electrically induced sleep,
the introduction of an electronic apparatus delivering 160-250
milliamperes of 60 cycle alternating current continuously in the human
subject for an initial 30 seconds through bitemporally placed electrodes
has been a recent modification of electroshock therapy. The first 30 
seconds of stimulation result in a tonic spasm with a few seconds of 
cardiac arrest and a 30-45 second respiratory suspension. The current 
is then dropped to 60-70 milliamperes, whereupon a few mild clonic
movements may be seen, and then after 60-75 seconds from the initial
application the current is increased by a five milliampere increment 
every 15 seconds to a maximum of 125 milliamperes at five minutes. The 
major differences from electroshock reactions are seen in a prolongation 
of stimulation of the autonomic nervous system, in prolonged flexor 
tone, and in the presence of forced grasping. These kinds of reaction are 
seen, however, in deep insulin coma, and Thompson (144) and his co- 
workers believe that electronarcosis is similar in therapeutic efficacy to 
insulin treatment and that it is superior to electroshock in the manage- 
ment of schizophrenia. 
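
The time course of current described above can be written out as a
piecewise schedule. The Python sketch below is offered only as a
reading aid for the historical description. The particular values
chosen within the reported ranges (200 milliamperes initially, 65
milliamperes for the reduced level, a ramp beginning at 60 seconds) are
assumptions, as is the supposition that the first increment falls 15
seconds after the ramp begins. The function name is hypothetical.

```python
def electronarcosis_current_ma(
    t_s: float,
    initial_ma: float = 200.0,   # assumed value within the reported 160-250 range
    hold_ma: float = 65.0,       # assumed value within the reported 60-70 range
    ramp_start_s: float = 60.0,  # assumed value within the reported 60-75 range
    step_ma: float = 5.0,
    step_every_s: float = 15.0,
    max_ma: float = 125.0,
) -> float:
    """Current in milliamperes at t_s seconds after the initial application."""
    if t_s < 0:
        raise ValueError("time must be non-negative")
    if t_s < 30.0:
        # Initial 30 seconds at the high current.
        return initial_ma
    if t_s < ramp_start_s:
        # Reduced current before the stepwise increase begins.
        return hold_ma
    # Assumption: the first increment occurs one interval after ramp start.
    steps = int((t_s - ramp_start_s) // step_every_s)
    return min(hold_ma + steps * step_ma, max_ma)


for t in (10, 45, 75, 150, 300):
    print(t, electronarcosis_current_ma(t))
```

With these assumed values the ceiling of 125 milliamperes is reached
somewhat before the five-minute mark, so the reported figures should be
taken as approximate.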

Other Methods. The employment of picrotoxin, anaphylactic shock,
nitrogen anoxia, carbon dioxide stimulation, refrigeration, sterile men- 
ingitis, and a few other methods which have been used experimentally 
in an effort to modify human psychoses need not be discussed since 
they have not been used extensively enough to have provided any 
psychological data relevant to the interests of this review. Psychic 
“shock” is also being advocated by at least one current writer (35).

Prolonged Narcosis. Continuous narcosis of several days’ duration 
is used most frequently in contemporary psychiatry in the handling of 
manic excitement and in the treatment of certain battle reactions. We 
must go back at least as far as MacLeod (62) for historical priority in 
the use of Dauerschlaf in the treatment of mental disease, although 
Locock in 1857 had advocated the use of bromide in the treatment of 
epilepsy and in neuroses, particularly hysteria. MacLeod, working in 
Shanghai in 1897, began the treatment of acute mania with bromide- 
induced narcosis of several days’ duration. Previously, of course, the 
ancients had made conflicting observations upon the value of sustained 
sleep in mental disease, and in the middle nineteenth century Andral 
and van Swieten among others had seen maniacal conditions cured by 
accidentally large doses of opium. In 1901 Wolff (62) treated confu- 
sional states with trional, and Epifanio (62) in 1915 was using luminal. 
Klase was encouraged by the results of somnifaine narcosis which he 
had induced in 26 schizophrenics and which he reported in 1922. 

Prolonged narcosis is not, strictly speaking, a “shock” therapy, but
a brief consideration is necessary because earlier it was used for all 
kinds of psychiatric disorder and reports, for example, like that of 
Palmer (106) who in a mixed series of cases treated with somnifaine ob- 
tained 33% recoveries are psychologically significant in evaluating the 
specificity of the nature of the shock-induced remissions. Meerloo (93) 
also obtained one-third recoveries in 500 well-mixed cases of psychiatric
disorder following prolonged barbiturate narcosis. Hennelly (62) breaks
down his remissions to 42% in mania, 32% in melancholia
(manic-depressive-depressed), and 44% in involutional melancholia.

Treatment Combinations. Early in the use of insulin, the hypoglycemic
coma was interrupted by metrazol convulsions in order to obtain
the therapeutic effect of the epileptiform seizures. Electric shock has 
also been used in conjunction with insulin. Moreover, failures with one 
procedure have been subjected to other kinds of shock therapy. There 
are psychiatric patients even who have initially undergone insulin 
shock, then a series of electroshock treatments, and who have then 
finally been subjected to frontal leucotomy. One of Freeman's and 
Watts’ patients, quite interestingly, passed through a psychoanalysis, 
an insulin shock treatment, and two frontal leucotomies, and she is 
currently engaged in working on her doctorate thesis (39). 

Adequate statistics and careful definition of the characteristics of 
pre- and post-treatment personality in these cases where more than one 
shock procedure has been used would be most helpful in the evaluation 
of the effects of these different shock methods. However, the literature 
is rather disappointing in this respect. Patients who have not responded 
to one kind of shock procedure may occasionally benefit by the applica- 
tion of another convulsion- or coma-inducing method. However, where 
such a change in the treatment technic is made with favorable results, 
this successful issue may in many cases very probably be due to the 
total length of the combined treatment and not to the change in pro- 
cedures, since some patients may not show adequate modification of the 
psychotic symptoms until after they have experienced 20 or more con- 
vulsions. There is some indication that insulin is more effective than 
other methods in the treatment of schizophrenia, and hence it may very 
well occur that some of such cases may show a disappointing reaction to 
metrazol or electricity and yet improve as a result of the hypoglycemic 
regime. 

More adequate research substantiation is needed before the supe- 
riority of any one shock procedure is established for the treatment of 
any specific psychiatric syndrome. More and more it appears that the 
intensity of treatment, of whatever kind, is the important considera- 
tion, and it may very well be, as Kalinowsky (70) suggests, that even 
manic excitement may be controlled by shock therapy provided the 
confusional form of treatment described by Lowenbach (86) is used, al- 
though Lowenbach, himself, believes that convulsive treatment is of 
little value in mania.

Illustrative of encouraging results obtained by a change of method is
the report of Weinberg and Goldstein (151) who observed that 32% of
100 patients treated with insulin following failure to respond to
metrazol showed good improvement.

Combination with Other Drugs. Various substances, such as curare, 
beta-erythroidin hydrochloride, or quinine methochloride may be
combined with the convulsive agent in order to lessen the motor severity of
the convulsion. Shock treatment has also been given with the patient 
under general and spinal anesthesia. Sodium amytal may be used to 
aid in the control of the resistive and apprehensive patient and in order 
to forestall post-convulsive excitement. Scopolamine has been similarly 
used. Cocaine has been employed to induce a pleasant post-convulsive 
emotional state. Magnesium sulfate has been combined with convul- 
sion-evoking agents, and strychnine has been used to increase the sen- 
sitivity of the nervous system to metrazol. 

Combination with Psychotherapy. Considerable controversy exists 
among psychiatrists as to whether shock treatment should be combined 
with active psychotherapy. In the case of any formal analytic process 
demanding continuity of progress and of the experience of therapeutic 
"movement," this problem is solved rather incontrovertibly in most 
cases by the inability of the patient to remember either the analyst or 
the content of the analytic sessions from one treatment day to the next. 
However, in many places where restrictions on time and personnel are 
not too drastic, pragmatic psychotherapy is usually attempted. Even
if the patient has amnesia for the period during which psychotherapeutic 
relations could be established, such interpersonal experience may quite 
conceivably be of considerable therapeutic importance even if it is not 
immediately reflected in prognosis. We must be aware, however, that 
Myerson (98), for example, allowed some of his ambulatory patients to 
be treated by electroshock with minimal personal contact, and he re- 
ported as favorable results with this group as with patients additionally 
aided by psychotherapy. Nevertheless, Orenstein and Schilder (102) 
pointed out some time ago that there is a difference in the kind of in- 
sight socially recovered patients may have, depending upon whether 
they have become aware of the psychodynamics of the self through the
process of psychotherapy or whether they simply know, following suc- 
cessful shock therapy alone, that they were once ill and are now, some- 
how, well again. 

The failure of later investigators using the insulin technic to achieve 
the good initial results reported by Sakel may be due in part to
differences in the intensity and quality of psychotherapy employed by the
different clinics. Sakel himself was willing to attribute 20-30% of the 
total insulin-induced remission percentage to the concomitant use of 
psychotherapy. 

Convulsive therapy, because of the post-convulsive accessibility and 
helplessness of the patient, may provide the first opportunities for the 
establishment of psychotherapeutic relationships with a mute, inac- 
cessible, or negativistic subject. Moriarity (65) believes also that the 
combination of convulsive therapy with psychotherapy yields better 
results than psychotherapy alone in treating the neuroses and the bor- 
derline psychoses. This observation, however, must be tempered by 
the consideration of the general agreement of opinion that shock therapy 
alone is of little or no value in the treatment of the neuroses, with the 
possible exception of the reactive and anxiety depressions and some of 
the more blatant hysterias. 

Duration of Treatment. Insulin coma is usually induced six days 
weekly for at least one month so that as many as 30 to 40 single treat- 
ments may be administered and in some cases, many more. Metrazol 
and electricity were used generally to induce convulsions about two or 
three times weekly to a total number of 10 to 15. The emphasis soon 
came to be placed upon an adequate number of convulsions (at least 20) 
and upon a more frequent occurrence so that currently several electro- 
shock convulsions may be administered even in the same day. Lowen- 
bach (86), for instance, has stressed the necessity of achieving long- 
lasting disorientation in psychotic subjects by frequent convulsions as 
one requisite for successful therapeutic effect. Freeman (39), also, con- 
siders frequent convulsions and the resulting enduring disorientation 
desirable. However, the decision about the spacing of convulsions de- 
pends upon what the effect upon the patient of shock treatment is 
thought to be as well as upon the personality reaction being treated. 

Observations of Immediate Treatment Reactions 

General Clinical Description. The actual behavior of patients react- 
ing to convulsion-precipitating agents has been described with consider- 
able agreement by many observers. Lowenbach and Stainbrook (87) 
have stated that a generalised convulsion results immediately in a state 
in which it is impossible to demonstrate operationally any behavior 
subsumed under the psychological conception of personality. “The
individual does not react to any kind of stimuli and the activity of the
electroencephalogram has almost ceased.” The convulsion is then
succeeded by the gradual return and reintegration of personality functions.
According to these writers, “it seems as if the subconvulsive reaction
does not differ from the convulsive except in degree, but in these minor 
reactions . . . reintegration may begin at any level and the recovery 
may be much more quickly and indeed sometimes imperceptibly 
achieved.” Cohen (23), studying the return of cognitive function fol- 
lowing metrazol convulsion believes that the continuum of recovery 
can be rather adequately defined in terms of the initial and immediate 
post-convulsive anesthesia, aprosexia, agnosia, apraxia, and amnesia. 
After four minutes usually, tactile, visual, and other sensations begin to 
show evidence of function, and there then follows in roughly chrono- 
logical sequence but with much overlapping, of course, and with pro- 
gressively decreasing exhibition of dysfunction, the recovery of atten- 
tion, gnosia, praxia, and memory. Stainbrook (137) writes that the 
cognitive behavior observed clinically following electrically induced con- 
vulsions may be considered conveniently as consisting of immediate 
transitory effects most economically conceived as reflections of severe 
neurological dysfunction and of a more remote and relatively enduring 
symptomatology of disorientation and dysmnesia. 

Ismael’s (69) observations on patients during hypoglycemic induc- 
tion recorded anomalies in sensory perception, including hypnagogic 
visions and hallucinatory images which were primarily visual but which 
occurred in all sensory spheres. He also saw what he inferred to be 
anxiety, terror, euphoria, rage, and panic. Misidentification of persons 
existed as well as disordered gesticulation and dyslalia. Evidence of 
exteriorized sexuality in the forms of homosexuality, sadism, and ex- 
hibitionism was also observed. Like von Angyal (5), Benedek (9), 
Palisa (105), Plattner (114), and others, Ismael was also able to demon- 
strate disturbances of the body-image. These latter investigators found 
that the hypoglycemic state produced changes in the perception of 
space and of color qualities and disturbances in gestalt comprehension 
and in concept formation. Pisk (113) described the appearance of the 
“Zeitraffer” phenomenon in insulin coma and reported a patient who
experienced acceleration in the flow of time. Orenstein and Schilder 
(102) also report alterations of time perception occurring during insulin
reaction. Benedek (9), quite interestingly, detailed disturbances in the 
perception of movement which were similar to those observed by Potzl 
and Goldstein and Gelb in occipital lobe organic lesions. Benedek also 
reports that micropsia, megalopsia, and primitive optic hallucinations 
appeared and that colors appeared less impressive to patients during 
shock and looked more saturated after shock was over. 

Silbermann (130), from his studies on subjects treated with insulin 
and triazol, describes two groups of symptoms, one group being classified
as regressive and occurring in the interval between the injection
and the onset of the coma or convulsion and one group being seen after 
“shock” and described as restitutive. Among the regressive symptoms
he would place feelings of giddiness and ill-defined feelings of apprehen- 
sion, sensations of hot flushes, excitability, disturbances in the evalua- 
tion of shape, distance and size, abnormal sensations of color, distor- 
tions of auditory impressions, feelings of unreality, guilt and fear of 
punishment, confusion, loneliness, feelings of world-destruction, and fear 
of death. The immediate post-shock, or restitutive, group of symptoms 
includes, characteristically, feelings of fear, confusion, and unreality as 
well as motor and sensory aphasia, euphoria, and a sense of being like 
a helpless child. Good (51) has also described the pre-convulsive aura 
following metrazol injection as evidencing apprehension, perplexity, 
strangeness, and choking sensations and as culminating in intense fear 
and terror. Friedman (40) in the same pre-convulsive reactions saw 
“rapidly mounting anxiety which in certain cases rose to a vivid deliriform
panic-state with concomitant characteristic vasomotor and psychomotor
overactivity.” Collins (24), too, stresses the emotional responses
of fear and apprehension in the aura of metrazol convulsions
and in the hypoglycemic pre-coma, and Hemphill (60) writes that fear 
of death may be consciously expressed by metrazol-treated patients. 
On the other hand, Cook (25), using a scale of the degree of fear exhibited
by cardiazol and triazol treated patients as rated by attending
nurses, concluded that the intensity of fear in 275 cases as recorded and 
correlated with the outcome of treatment offered no evidence for assum- 
ing that fear exerts any curative influence. 

Interestingly, Dyne and Tod (30) in a comparative study of reac- 
tions to subconvulsive and convulsive doses of triazol found that non- 
schizophrenic mental patients reacted more prominently with fear and 
anxiety than did a group of “emotionally deteriorated” schizophrenics.
Good (51), however, discovered that both psychoneurotic and psychotic 
patients showed the same fear, and he also saw no differences between 
these groups in their immediate post-convulsive behavior. He noted, 
though, that the post-convulsive disorder of dysmnesia and disorientation
was of much shorter duration in psychoneurotics than in psychotics,
and this observation supported his contention that shortness of
post-convulsive confusion and disorientation is a favorable prognostic 
sign. Lowenbach and Stainbrook (87) have put forth the opposite 
theory that a long post-shock disorientation after single convulsions is 
a good prognostic index. 

Hemphill (60) postulates three psychological events in the course of
convulsive treatment: (1) the realization of being treated, (2) the return 
to reality from the death-like state of the epileptic fit, and (3) the en- 
counter with the environment after reestablishment of consciousness. 
The observation is also made that the patient is able to identify and 
name parts of his own body before objects in the external world, and it 
is therefore inferred that self-interest or awareness of the ego returns 
first in post-convulsive recovery. Opposed to this observation is that of 
Schilder (126) who states that, after metrazol convulsions, patients are 
able to name objects before they can designate correctly parts of their 
bodies. Additionally, Hemphill remarked that the use of certain emo- 
tionally descriptive adjectives may be seen as evidence of exaggerated 
affective attitudes existing in the immediate post-shock state. The 
post-convulsive subject is also described as feeling dependent and at- 
tached to persons in the environment and as assuming infantile atti- 
tudes. 

Kalinowsky and Kennedy (73) have observed that post-convulsive 
phenomena following electric shock are surprisingly constant in each 
individual and they believe that convulsive and post-convulsive be- 
havior follow the predetermined pattern of the individual and that these 
reactions are not influenced by the type of stimulus nor by previous 
medication. 

Psychoanalytic Description. Rankine Good (51) divides the immediate
post-convulsive state into four stages reminiscent of the analytic
conception of sexual genesis: (1) the primary narcissistic stage, (2) the 
oral phase, marked by retention of gag, sucking and chewing move- 
ments, and spitting, (3) the anal phase, in which fecal smearing and 
coprophagia may occur, and (4) the phallic stage, during which finger- 
ing the genitals, clawing the vulva, exhibitionism, beating the genitals, 
and masturbation may be seen. It must be remembered, however, that 
these activities do not occur in every patient and that some of them are 
seen while the patient is operationally unconscious. Possibly, as Lowenbach
and Stainbrook (87) have suggested, some of these movements like
sucking and chewing may be psychologically meaningless and may point 
to a cortical area being resistant to convulsive exhaustion. Mayer- 
Gross (90), discussing these oral and facial movements from a neuro- 
logical point of view, suggests also that such movements may be moti- 
vated by hunger, at least in hypoglycemia. However, since sucking and 
chewing behavior may be seen after electroshock as well, it is unlikely 
that hunger is the explanation. Interestingly, Larkin (83) feels that he 
can prevent delayed insulin coma by stimulating oral sucking and by 
“mothering” the patient by embracing and stroking him after glucose
has been given to terminate the hypoglycemia. Abse (3) sees a con- 
firmation of psychoanalytic theory in his challenging but inade- 
quately demonstrated observation that, post-convulsively, manic-de- 
pressive patients show predominantly oral activities, while paranoid 
schizophrenes express primarily anal interests, and degenerative hys- 
terics display mostly phallic behavior. 

Specific Description. Orenstein and Schilder (102), and Schilder
(126), used some of Wertheimer’s original gestalt figures as adapted by 
L. Bender for a “visual motor gestalt perception test” in the study of
gestalt organisation during and immediately after insulin reaction and 
in the post-convulsive metrazol period. They found disturbances in 
gestalt function to consist of (1) a tendency to perseveration, (2) sub- 
stitution of circles and loops for points, (3) substitution of curves for 
angles, (4) substitution of uninterrupted lines for dotted lines, (5) 
changes of angles into straight lines, (6) rotation of figure parts, and (7) 
spatial separation of gestalt units. Stainbrook and Lowenbach (140), 
using these same gestalt figures over the period of post-convulsive 
reintegration and getting repeated drawings of a gestalt model at vari- 
ous time intervals after convulsion, found similar disturbances. They 
also noted that the earliest attempts at copying the Wertheimer designs 
were reactions to the whole figures, a simple loop separated in space be- 
ing made for each obvious element in the model. In the recovery se- 
quence angularity was the next feature to be correctly represented, 
then the units of the figure were slowly brought into proper spatial rela- 
tionships, and finally, near the end of the post-convulsive recovery 
period, the parts of the figure were represented joined together as in 
the model. 

These same investigators also studied the return of the writing func- 
tion following convulsion. Perseveration and more tendency to comple- 
tion of the task for the patients’ own signatures than for other sug- 
gested phrases were noticed in the early periods. The first post-convul- 
sive writing attempted had in spite of tremor and incoordination all the 
formal characteristics of the individual’s ordinary handwriting. Fre- 
quently patients reverted to writing the names associated with child- 
hood or adolescence or to the diminutive forms of their names or to a 
phonetically simpler spelling. Married women frequently wrote their
maiden names. Printed letters occasionally appeared, and once or twice 
a substantive within a sentence was capitalized. 

Moore (95) has compared the reactions of a patient in a mild hypo- 
glycemic state with the pretreatment behavior, using Street’s gestalt 
completion test, serial subtraction of seven from 100, and the Goldstein
modification of the Kohs Block Design test. He found fewer figure
interpretations of the gestalt test, the appearance of finger-counting as
an aid in serial seven subtraction, and inability to “abstract” the designs
of the Kohs blocks. It was concluded that analytic and synthetic
thinking was impaired with greater dependence upon sensory impres- 
sions, a behavior consistent with Goldstein’s hypothesis of the existence 
of a “concrete attitude” or loss of the “categorical attitude.” 

Fingert, Kagan, and Schilder (34) have asked schizophrenic patients
to draw the figure of the Goodenough test during the awakening from 
the effects of metrazol seizures or from insulin coma. They have ob- 
served that the drawings show signs of organic confusion in gestalt per- 
ception and representation. However, the typically schizophrenic 
handling of the test is seen when the organic confusion expressed in the 
drawings of the early post-convulsive stage subsides. The conclusion 
is recorded, therefore, that the treatment introduces a new element, the 
organic confusion, into the picture and that the effect of insulin and 
metrazol is not a direct attack upon schizophrenic structure but upon 
“deeper seated structures” leading to their reconstruction and reorgan- 
ization. If a concomitant reorganization of the schizophrenic process 
takes place, it is “reflected in the gradual loosening of the schizophrenic 
characteristics of the drawings.” 

Following Angyal’s (6) neurologizing assumption that the symptoms 
of hypoglycemia may be encompassed within six insulin-shock syndromes,
(1) frontal lobe, (2) ontogenetic, (3) aphasic-amnesic, (4) static-paresthetic,
(5) coenesthetic, and (6) parieto-occipital, Gyarfas (55) has
observed two kinds of spontaneous drawing disturbances in hypo- 
glycemia. One group of disturbances is considered to be frontopolar in 
origin with evidence of elimination of inhibition, conventions, and 
schemes of thought, and a general reduction and regression to a more 
primitive personality is seen. This inferred regression is entirely absent 
from a second group (the parieto-occipital syndrome), where disturbances
of drawing due to the dysfunction of the “constructive faculty”
prevail without infantile and primitive personality traits, probably a
kind of drawing behavior reflecting a disorder like the constructive
apraxia of Kleist.

Vujic and Ristic (146) have studied disturbances of colored after- 
images during hypoglycemia and they report that on the average the 
duration of after-images is reduced by 68% and that complementary 
color is lost in certain cases. In 50% of the subjects there was a tempo- 
rary complete loss of after-images. 

Kao and Lyman (76) found that electric shock treatment abolished
in an eidetic patient the prolonged, mobile transformations of the eidetic 
imagery. Afterwards, the eidetic responses regained their former vivid 
and colorful character and returned to practically the same state as 
before treatment was instituted. 

On the basis of electromyographic and cinematographic studies of 
single metrazol seizures, Strauss, Landis, and Hunt (141) divided the 
convulsion into a first clonic stage, a tonic stage, and a second clonic 
phase. The second clonic phase is considered to be like the tonic phase 
except that interruptions in the continuity of innervation appear. The 
first clonic stage is different from the second clonic stage and is probably 
due to cortical stimulation. The tonic stage corresponds to a state of 
decerebrate rigidity. 

Kino (79) found that the characteristics of dermographia did not 
change before and after single electroshocks, but Fortuyn (37) saw 
alterations of dermographia in the hypoglycemic state, consisting of a 
decrease in the time of latency and the appearance of a broader red 
line than normally present. The white area surrounding the red line be- 
came regressive or was even absent. 

Kino and Thorpe (80) also described the occurrence of the grasping 
reflex in post-convulsive stages of electrically induced seizures and 
found, quite interestingly, a difference in frequency of the appearance 
of this reflex in acute schizophrenia and in manic-depressive psychosis. 
It would be desirable to have further confirmation of this observation 
that the expression of a neurological reflex is selectively influenced in 
the post-convulsive state of patients with different mental diseases. 

Using a modified form of the Rorschach test presentation, Stainbrook
(137) was able to assemble composite Rorschach psychograms in
many cases for each five-minute interval following the onset of an electroshock
convulsion. He concludes that progressive increase in productivity
and accuracy of form conceptualization, disappearance of
perseveration and simple color-naming responses which appear in the 
earliest records, and the gradual appearance of FC, or form-color, responses
following a phase of primarily CF, or color-form, answers were
the main Rorschach indices of change in the post-convulsive period. 
Movement responses were always the last Rorschach concepts to re- 
appear in the records. 

Flescher (36) studied the extent and duration of retrograde amnesia 
following electric shock in 18 early schizophrenics, giving various mem- 
ory tests before the shock and then again at times varying from a few 
minutes before shock to four to seven hours after. It was concluded that 
material which does not appear spontaneously or through association or
recognition at the end of this time is permanently lost. Hemphill (60) 
investigated the characteristics of the recall of eight pictures presented 
to his patients about 30 minutes before the injection of metrazol and 
again at about an hour after the convulsion. He writes that no patient 
failed to recollect having seen the pictures and that no patient remem- 
bered less than three pictures and that three patients were able to de- 
scribe all the pictures. He also insists that the absence of retrograde 
amnesia is easily demonstrated after electrically induced convulsions. 

Ten patients treated by electric convulsive therapy were taught 
paired word associations by Zubin and Barrera (162) before treatment, 
and their retention was then tested after single shocks by recall, relearn- 
ing and recognition methods. A control series of associates had been 
learned and tests of retention given a week before treatment. They 
found no significant saving in the experimental series, although the 
learning ability was not impaired. The recall and recognition scores 
also showed evidence of memory loss. They concluded that shock af- 
fects material learned immediately before the convulsion more than 
that learned less recently. Zubin (161) also introduced interference in 
the form of new associations to the original words and found that such 
interference was accentuated after shock. It therefore appeared that 
electroshock disorganizes but does not destroy memory traces. Rod- 
nick (120), publishing the results of one of the few well-devised experi- 
ments of the effect of therapeutic shock upon habit systems in the hu- 
man subject, persuaded 21 schizophrenes to learn two similar but an- 
tagonistic simple habits of moving the finger either to the right or left 
depending upon the frequency of a tone. Twenty-four hours intervened 
between the learning of the first habit and the similar but opposite 
second habit. Patients were retested one and a half hours after metrazol 
shock to determine habit dominance. He found a statistically higher 
number of reversals to the older habit in the shock group than in a similar
control group. This indicated to him that metrazol exerted a differential
effect with reference to older and to more recently acquired
habits. 

Wiedeking (153) used tests like Bourdon’s attention test and a word- 
association test and demonstrated in normal subjects who voluntarily 
submitted to Sakel’s technic of inducing hypoglycemia that a parallel- 
ism existed, with small doses of insulin, between mental impairment and 
falling blood sugar. With high insulin doses there was a continuous drop 
in mental efficiency regardless of the variation in the blood sugar level. 

Subjective Description. Wiedeking (153) reported the subjective experiences
of three medical students who volunteered to subject themselves
to insulin-induced hypoglycemia. They described feelings of
marked hunger, heaviness of the limbs and general weakness, feelings of 
being tired and wanting to be left alone. The mood was one of apathy. 
Consciousness varied in phases with “clouding.” A feeling of blankness
developed out of difficulty in thinking. Disorders of sensation were fre- 
quent. Entoptic phenomena and pseudo-visual hallucinations occurred. 
On awakening, euphoria and a sense of having been saved from danger 
were marked. 

Gillespie (47) published the results of an inquiry into the subjective 
sensations of 23 mental patients undergoing metrazol treatment, but 
because the patients' accounts tended to be distorted by their mental 
condition, he himself underwent a metrazol convulsion. He described
distinctly unpleasant feelings with headache, anorexia, and malaise for 
many hours afterwards. There also existed some retrograde amnesia 
even 10 hours after the convulsion. 

Watkins, Stainbrook, and Lowenbach (147) reported a subconvulsive
electric shock reaction in a 25-year-old normal physician in which
absence of somatic and mental complaints and the presence of amnesia 
and disorientation were the outstanding features. Immediate post-con- 
vulsive copying of a gestalt figure was accomplished initially by making 
dextrad circling movements, following which the subject attempted to 
“close in” on the model and to draw directly on the figure, behavior
similar to that seen in psychotic patients in the early post-convulsive 
period. Some dysmnesia in the ability to designate concepts was evidenced
by the Rorschach test. The existence of anterograde amnesia was
also displayed on the Rorschach blots by the continual repeating of
responses which he had already given and which he had forgotten had
once been given. 

Fraser and Sargant (38) have published written accounts by schizo- 
phrenics at the end of an insulin treatment course in which their sub- 
jective experiences during their illness are detailed. No attempt was 
made to analyze these personal records and a reading of the five typical 
letters published gives no knowledge of how these patients interpreted 
either a single insulin reaction or the total treatment. 

Observations During and After Treatment 

General Description. Polatin and Spotnitz (115), describing an ambulatory
insulin shock technic with which they treated 44 schizophrenic
patients with 82% showing clinical improvement, write that such improvement
occurs in four progressive stages: (1) improvement in spontaneity,
(2) greater interest in the environment, (3) dissolution of psychotic
ideas, and (4) readjustment to the ordinary ways of living.

Harris (58) reports that the characteristic picture seen in schizo- 
phrenic subjects treated by cardiazol was one of euphoria and mild 
overactivity. The euphoria is described as causing paranoid patients to 
forget their grievances at least for a time. This writer also remarks that 
cases of catatonic stupor passed into catatonic excitement or into hebe- 
phrenia if they did not recover and that mute and inaccessible patients 
began to talk, that untidy patients became cleaner, and that the unoc- 
cupied began to work. Schilder (126) thought that, with metrazol, 
catatonic symptoms very often disappeared first while paranoid ideas 
frequently remained. 

Friedman (40), employing metrazol, stressed the changes revealed 
in spontaneous verbal and written expression in 70 chronically ill pa- 
tients, some of whom began to write letters after having not done so for 
years. This observer also felt that there soon occurred a fear of somatic 
organ changes in almost every case, and he saw behavior which he de- 
scribed as “an expression and release of previously dissociated, im- 
mature, and pathologic sexuality.” Alterations in affect, increased sen- 
sitivity, and sudden impulsive-destructive reactions directed either out- 
wardly or towards the self and occurring in previously apathetic or 
stuporous individuals were also noted. 

Feldman, Fiero, and Hunt (33), describing suspicion, agitation, fear, 
and fairly well-systematized ideas of persecution in patients before 
treatment, saw that, following treatment, these patients displayed per- 
sonality changes from the schizoid characteristics to overt, well-bal- 
anced social attitudes. However, these changes were mostly only tempo- 
rary. One case, interestingly, became psychotic again with a hypomanic 
extraverted reaction in contrast to the pre-treatment introverted schiz- 
oid personality. Blair (13) has noted that the firmly rooted delusions 
of the typical paranoid schizophrene are rarely affected by shock treat- 
ment but that delusions of a superficial and bizarre nature attached to or 
resulting from changes of mood may disappear. Bain (8) also had little 
success with metrazol therapy when there existed either fixed delusions 
or pronounced apathy. 

Cheney, Hamilton, and Heaver (22), in treating patients with 
metrazol, observed that the symptoms of depressed and agitated states 
were supplanted by a pathological though not marked elevation of 
mood. Slight hypomanic states usually followed recovery from depres- 
sion and manics occasionally appeared slightly depressed after electric 
shock therapy in the hands of Kalinowsky, Bigelow, and Brikates (72).

There is general agreement that shock therapy is most valuable in
the treatment of depressive states. As Kalinowsky (70) observes, a 
depression can be cut short at almost any time by adequate shock treat- 
ment. 

Kant, Phillips, and Stolzheise (74), presenting a description of shock 
treatment in schizophrenia, state that hallucinations, body sensations, 
and all unusual symbolic expressions are the first symptoms to disap- 
pear during treatment, but they have observed that, although the 
voices cease, the ideas of persecution remain, and although the patient 
may be no longer tormented by “electric rays,” the idea of being influenced
is vaguely present. Silbermann (130) has also noted that hallucinations
may often disappear during treatment, and Moore (94), who
discussed the maintenance treatment of chronic psychotics by electri- 
cally induced convulsions at the rate of about four to six a month, de- 
scribed the temporary disappearance of hallucinations. Gruenberg (54) 
found in treating schizophrenics with high frequency current that to- 
wards the end of treatment auditory hallucinations became subjectively 
less loud, less clear, and more distant. He applied this treatment with 
some success in alcoholic hallucinosis. 

y Rosas (122) used metrazol to treat hysteria and obsessional neurosis 
and concluded that the most resistant symptoms were those of a com- 
pulsive nature, particularly the obsessions and the phobias. Metrazol 
was used, too, by Owensby (103) in the therapy of homosexuality and 
lesbianism and he reported six such cases as “cured” with, in some cases,
establishment of normal sex relations and no return of homosexual de- 
sires or tendencies for as long as 18 months after treatment. Liebman 
(84) subjected a psychotic, transvestic male homosexual to electric 
shock and obtained remission of the psychosis, disappearance of the 
transvestism, but only some inhibition of the homosexual behavior. 

Weigert (149), who is a psychoanalyst and therefore more critically 
resistant to overenthusiasm about shock treatment, writes that the pa- 
tients are changed in emotional behavior during shock therapy for a 
more or less limited time without having changed their fundamental 
attitudes towards life-problems. But Kalinowsky (70), too, agrees that, 
at least in psychoneurotic depressions, although the depression is favor- 
ably treated, the neurotic attitudes remain. Varying the emphasis 
somewhat, Schilder (126) thought that the “psychoses and the psychotic
symptoms are not forgotten” but that the individuals have
changed their emotional attitudes. Berrington (12) has published a few 
very interesting expressions from cardiazol-treated patients which sub- 
stantiate the attitude change towards less intensity in the activation of 
mental content. One patient, for example, said that “It (the treatment)
has held my imagination in curb. There is nothing I am hunting for.”

A probing psychodynamic approach in patients undergoing metra- 
zol therapy does not seem to have the disturbing emotional aftermath 
that such an analysis frequently evokes in individuals not in convulsive 
treatment, according to Cheney, Hamilton, and Heaver (22). Eissler 
(31), who was primarily interested in an analytic evaluation of metrazol 
treatment, felt that schizophrenic patients so treated showed, in addition
to lack of self-observation, a lack of emotional depth and a withdrawal
from personal contacts. He also thought that these patients’
dreams had, after treatment, degenerated to expressions of predominantly
simple overt wish-fulfillment such as Freud originally believed
characteristic of small children. And referring to changes in the dreams 
of patients undergoing shock therapy, an area of research which has received
but little attention, Boss (18) believes that if there is no change
in the dream-content of patients being treated by convulsive therapy 
the prognosis is poor. 

Moore (94), treating some patients by a maintenance regime of four 
to six convulsions monthly for in some cases as long as two years, insists 
that clinical observation of these individuals shows no deterioration in 
personality or intelligence. This observation, however, is not objectively 
substantiated. Neymann, Urse, Madden, and Countryman (100) also 
saw no dementia or flattening of the personality among the patients of 
their recovered group of schizophrenics, manic-depressives, and chronic 
alcoholics subjected to electroshock therapy. 

Kalinowsky (71) believes that all patients undergoing electric con- 
vulsive therapy show an early impairment of memory as well as emo- 
tional disturbances. He does not feel, however, that learning is grossly 
affected and cites an interesting case of a refugee physician who was 
treated for depression until he was deeply confused and had confirmed 
Pitres’ rule by losing his ability to speak English. Nevertheless, a few
months later he passed a state board medical examination. Smith, 
Hastings, and Hughes (133) mention memory changes as always occur- 
ring to some degree with electroshock therapy, but they write that 
“these memory defects do not seem to be permanent.” Schilder (126),
too, thinks that the amnesia seen after metrazol treatment completely 
clears. 

Whatever may be the reports concerning the complete disappear- 
ance of shock-induced memory impairment, no one who has talked to 
patients who have undergone electroshock treatment can doubt that 
there is a considerable amount of experience surrounding and during 
the course of treatment which remains permanently inaccessible to
memory. Quite parenthetically, it is interesting to note that Ingalls 
(68) successfully treated a case of hysterical amnesia by metrazol- 
induced convulsions. After the fourth convulsion, his patient is re- 
ported to have regained complete memory of his past. 

Harris (58) remarks that the majority of his cases gained weight 
but that the weight gain was independent of mental changes. There oc- 
curred a marked weight increase even in patients who showed no re- 
covery. Reports of gain in weight are perhaps the most consistent and 
universal treatment results described by users of shock therapy, of 
whatever type, and psychosurgery. 

Early in the history of insulin therapy considerable attention was 
paid to the acute psychotic picture sometimes occurring during treat- 
ment. Sakel (125) called this the activated psychosis. However, as 
Kalinowsky (71) has observed, “all the known varieties of symptomatic 
psychoses occurring in infectious and toxic diseases can be seen in a long 
course of electric shock treatments.” These states are reversible and, as 
Glueck and Ackerman (50) point out, these acute psychotic episodes 
are not like actual psychoses in that (1) they are acute and unstable,
(2) the emotional display is genuine and reflects depth,
(3) the mental content is chaotic, (4) the patient makes more direct 
demands upon the environment with less recourse to veiled symbolic 
expression, (5) the patient reacts to frustration of demands more posi- 
tively and energetically and with more direct expression of hostility, 
and (6) the patient exhibits primitive infantile but strong transference. 

Specific Description. Much of the early psychologic research in shock 
therapy centered about attempts to arrive at factors which might afford 
a basis for the selection of patients for successful therapy. Bolles, Rosen,
and Landis (17), using the Vigotsky, Weigl, and BRL sorting tests on 
19 schizophrenes treated by insulin, concluded that the patients whose 
performances were superior before treatment showed most improve- 
ment and that the patients with poor pre-treatment performance 
showed little or no improvement. 

Skottowe (132) felt that patients who showed what he called dys-symbole
(“a state of mind which manifests itself by the inability of the
patient to formulate his conceptual thoughts upon personal topics or to
discriminate the gradation of his emotions in language which is intelligible
to others . . . notwithstanding that he may be in a state of clear
consciousness”) do not respond favorably to shock therapy. Thomas
(143), believing dys-symbole to be pathognomic of true schizophrenia, 
observes, that in 32 cases which he treated with either insulin or cardi- 
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azol, no case showing unequivocal signs of dys-symbole made a good re- 
covery, nor did any recovered case at any time show evidence of this 
kind of thinking. 

Piotrowski (110), using the Rorschach test, felt that, on the whole, 
the difference in pre- and post-treatment records paralleled the clinical 
improvement. This improvement was manifested on the Rorschach 
test by (1) improvement in speed and ease of answering, (2) better 
logical content of responses and less mixing of description and in- 
terpretation, (3) an increase in the number and quality of movement 
responses, (4) an increase in the number and percentage of form-color 
concepts, (5) an increase in percentage of sharply perceived forms, and 
(6) good percepts and integrating capacity. Signs of predictive value 
for good prognosis were devised (112) and described as (1) variety, 
indicating no concepts used more than twice, (2) generic term, referring 
to patient’s attention to logical hierarchy, (3) evidence, or self-critical 
evaluation of responses for “adequacy of fit,” (4) color response, or at 
least one color interpretation except simple color-naming, (5) indirect 
color, or evidence of attention to color-areas, and (6) demurring, or 
holding back one or more responses. Elsewhere, Piotrowski (111) has 
reported that patients who give both color and human movement 
responses to the ink-blots have the greatest chance of benefiting from 
insulin therapy. As seen in the performance on the concrete-abstract 
tests, Piotrowski also concludes that on the pre-treatment records 
the patients who improved with shock therapy were functioning on 
“a higher intellectual and emotional level” than was the pre-treatment 
unimproved group. 

Halpern (57) studied with the Rorschach inkblots 17 schizophrenes 
before and after insulin treatment and found that the improved and 
unimproved differed reliably in that the improved group gave a greater 
number of responses, five times as many movement responses, more 
color responses, and a greater number of human concepts. Eisner and 
Orbison (31) also gave the Rorschach test before and after metrazol 
therapy and decided that the patients who benefited were “more 
emotionally inhibited, more constricted, and more socially withdrawn” 
than those who did not benefit. Moreover, these investigators felt that 
no matter what clinical improvement occurred these patients “re- 
mained unequivocally schizophrenic after metrazol.” 

Morris (96) administered the Rorschach test to 41 patients under- 
going metrazol therapy and applied the chi square test for the validity 
of his differential signs and concluded that patients tending to remain 
unimproved gave 15% or more anatomy responses and more than two 
color-form concepts and less than 70% good form responses in their 
pre-treatment records. 

Kenyon, Rappaport, and Lozoff (77) used the Rorschach test, the 
Babcock deterioration index, and the Szondi test before and after 
metrazol treatment of three paretics. To the Rorschach blots the 
paretics reacted with less total color expression following treatment. 
The Babcock test scores improved with clinical improvement, but, of 
course, marked deterioration existed even in the fever- and metrazol- 
improved patients. 

Kisker (81) indicated that the pre-treatment Rorschach picture of 
insulin- or metrazol-recovered patients was characterized by lack of 
concentration, numerous incidental remarks, good perception ac- 
companied by uncontrolled, extensive associations, and an unevenness 
of performance. He also observed a falling off of the mean M or move- 
ment score after the start of pharmacotherapy and the appearance of 
confusion after 20-30 shock days. Like almost all users of the Rorschach 
blots in the study of shock-remitted psychotics, he noticed that some 
patients with good clinical improvement still showed psychotic 
characteristics in their Rorschach performance. Weil (150), too, writes that 
psychotic features appear in the Rorschach situation even after an 
insulin-treated patient appears clinically cured. 

Using standard intelligence tests, Wittman (156) found that test 
scores increase progressively after metrazol treatment begins until the 
ninth convulsion, after which they gradually decrease. She and Russell also 
noted that the intellectual performance improved in interest, attention, 
and social responsiveness (157). 

Wechsler, Halpern, and Jaros (148) made a comparative psycho- 
metric study of mental efficiency of schizophrenic patients immediately 
before and after insulin treatment. The performance on Wechsler's 
vocational interest blank, a test of counting by three’s, naming words 
in three minutes, and a similarities and a directions test gave a correla- 
tion of .73 and a correspondence of 87% with a clinical appraisal of the 
patients’ condition six to 18 months after the end of treatment. The 
increase in the number of “liked” occupations on the vocational interest 
list was particularly marked in the improved patients. Their analysis 
also suggested that certain patients may be harmed insofar as test 
performance after treatment is concerned. McNeel, Dowan, Myers, 
Proctor, and Goodwin (91) used a large battery of psychometric tests 
before, during, and after insulin therapy and concluded that a fair 
correlation existed between clinical psychiatric rating of post-treatment 
status and the psychometric rating. Luborsky (88), too, gave a battery 
of 22 psychometric tests to 12 patients before, during, and after 
treatment by electroshock. He concluded that the profiles of the 
schizophrenic patients showed a general decrease in test scores from the line 
of zero change in the before-during comparison. This decrease was not 
as marked in the before-after contrast. The profiles of the depressive 
patients showed large score increases in the before-during comparison 
and further increase in the before-after contrast. 

Sherman, Mergener, and Levitan (128) employed several memory 
tests consisting of designs, directions, and a reading paragraph and 
found a slight improvement in all tests after the conclusion of electro- 
shock treatment. 

The tests of Zucker and Herbert were used by del Pino (27) with 
schizophrenic patients during cardiazol treatment. He described im- 
provement as being reflected in quicker appearance of images, easier 
representation of scenes, and in the absence of the deviation or replace- 
ment by other images such as occurred before treatment. 

O’Connell and Penrose (101) studied the psychomotor efficiency in 
30 patients treated with metrazol, quantifying the reaction time to an 
auditory stimulus, the tapping rate, and the strength of grip, and they 
concluded that in patients who showed before treatment marked in- 
competence in the tests, whether due to stupor or to agitation, improve- 
ment was greatest. They also found that the increase in the tapping 
speed was greatest in the early phase of treatment and that the amount 
of improvement decreased steadily as the number of convulsions in- 
creased. Subconvulsive doses of metrazol did not contribute appreciably 
to improvement in tapping scores. 

Solomon, Darrow, and Blaurock (134) measured the blood pressure 
and electrodermal response during standardized interview situations 
before and after insulin and metrazol therapy and found, following 
pharmacologic treatment with clinical recovery, that in two groups 
of patients (one group showing pre-treatment psychological resistance 
to the examiner or to the test-situation and the other evidencing re- 
sistance involving perplexity in discussing or the evading of emotional 
problems, and all of whom gave large blood pressure reactions and 
small galvanic responses to several ideational stimuli in the interview), 
the autonomic responses were altered to show an unchanged blood 
pressure and larger palmar galvanic changes. Patients whose 
pre-treatment resistant attitudes persisted and who were unchanged by 
treatment revealed no change in the character of their autonomic 
responses. Patients who were overtly passive and cooperative both 
before and after treatment showed both blood pressure and galvanic 
responses of large amplitude before treatment and these responses 
remained unchanged after shock therapy whether or not recovery took 
place. They concluded that improvement or recovery involves a 
decrease in inhibitory effects and favors increased sympathetic reactions 
in combination with and partially balanced by greater cholinergic 
response to ideational stimuli. 

Interpretations of the Psychosis-modifying Effects 

Non-analytic Theory. The lack of enthusiasm in contemporary 
psychiatry for the various systems of psychologic description is amply 
confirmed by the occurrence of an almost complete adherence to psycho- 
analytic concepts in the reports of psychiatrists describing the effects 
of shock therapy upon psychotic behavior. Not all of these psychiatrists 
are psychoanalysts in the formal sense, and, hence, some of the writing 
is hesitant and tentative in the use of analytic constructs. However, 
the psychoanalysts, themselves, have been generally very keen ob- 
servers of what happens to the shock-treated personality, but their 
descriptions are so admixed with analytic interpretations that in many 
cases the facts of behavior cannot be distinguished from opinions about 
the facts. 

Nevertheless, various non-psychoanalytic interpretations of the 
process of symptom-remission following shock procedures have been 
put forth. Otto Potzl, who made possible the easy introduction into 
psychiatry of Sakel’s treatment, believed that the symptomatic disap- 
pearance of the psychosis was due to the disturbances of consciousness 
and to the associated impairment of memory. Myerson (99) also at- 
tributes recovery to the memory difficulty. He thinks that the “mecha- 
nism of improvement and recovery seems to be to knock out the brain 
and reduce the higher activities, to impair the memory, and thus the 
newer acquisition of mind, namely the pathological state, is forgotten. 
As the brain recovers, the well-established trends — those which are 
relatively normal — come back, but the incubus of more recent evolution 
and with less roots — so to speak — of thinking, feeling, and doing, 
remains away.” 

The neurological descriptions of the interruption of “pathways” 
and of either the creation or abolition of disturbances in cortical neural 
“connections” and of the readjustment of the neuronal “shortcircuiting” 
supposedly associated with the abnormal behavior constitute some 
of the neurologic attempts to explain the remissions from psychosis 
observed following shock therapy. 

Referring to the work of Lowenbach and the writer, Fulton (43) 
considers electroshock treatment as a functional ablation of the frontal 
areas and links this kind of therapy with psychosurgery as, indeed, 
Freeman had done earlier. The psychological interpretation of the 
efficacy of electric shock treatment from this point of view is that this 
functional ablation of the frontal cortical areas removes some of the 
influence of these areas from active participation in behavior and thus 
psychologic projection into the future with the attendant anxiety is no 
longer easily possible for the subject. 

Fear of the convulsive experience, particularly of the convulsions 
evoked by agents other than electricity, has been considered as an 
important motivation for the patient to control psychotic behavior in 
favor of more adequate reality-testing. 

Psychoanalytic Theory. The psychoanalysts have generally stressed 
the punishment aspects of shock treatment, holding that convulsions 
may discharge energy derived from inwardly-directed destructive 
tendencies, particularly manifested in the depressed states, so that 
either the expression and relatively guiltless acceptance of hostility may 
then be possible for the patient or so that a relative increase in the 
strength of the life-instincts may occur. Weigert (149), for example, 
writes that the “cruelty of the super-ego is replaced by a sadistic attack 
on the part of reality” and so, by discharging the self-destructive urges 
from the inner conflicts, reality becomes again the object of libidinal 
cathexis. 

Then, too, the primitive transference situation many times 
established following convulsions may give reassurance to the patient, as 
Wilson (155) indicates, that the good and beloved parent figures are 
real and that the hate and death wishes regressively generated around 
the primal negative super-ego images are not triumphant over the love- 
giving representations. Bychowski (20) also observes that the patho- 
logic ego which has been distorted by the infantile desires of the id is 
partly destroyed or weakened by treatment. The resistance against 
reality is therefore decreased and the readiness for a positive transfer- 
ence is increased. 

Abse (1, 2, 3) believes that the treatment situation evokes objective 
anxiety and that this threat of external danger orients the ego to the 
current reality-situation and results in an anxiety-motivated repression 
of the psychotic content so that, in schizophrenia at least, the patient 
is no longer “at the mercy of his complexes.” The reestablishment of 
the functional supremacy of the ego-complex is thus achieved as a result 
of these repeated danger situations and the concomitant heightening 
of somatic sensation involved in the convulsive experience. In an 
article by Schilder (126) the euphoria commonly seen following shock 
treatment has been ascribed to the overcoming of the death threat of 
the convulsion with the consequent feeling of a hypomanic joy of 
rebirth. 

Almost all analysts are agreed that the aims of psychoanalytic 
therapy are in direct opposition to the apparent results achieved by 
shock therapy. The aim of psychoanalysis is seen either as the mitiga- 
tion, and not the mere replacement in kind or even augmentation, of 
the cruelly primitive super-ego so that the patient may be able to bear 
the necessary burdens of reality, or psychoanalysis is directed to the 
weakening of repression and not to the reinforcement of the repressive 
mechanism of defense which the analysts see as the result of convulsion 
therapy. 


Animal Experimentation 

The experimental quest for an adequate theory which would make 
rational the empiricism of convulsive therapy as used in psychiatric 
treatment is probably best oriented around human subjects. Certainly 
there is now no lack of a human experimental population with which to 
work. Nevertheless, psychologists and other workers interested in the 
rationale of convulsive treatment will want to study by animal as well 
as by human experimentation the complex of psychologic happening 
which is included in the effects of artificially induced convulsions. The 
results of many such controlled investigations are in existence concern- 
ing the behavior of animals following convulsions induced by various 
procedures. Most of these studies refer primarily to the experimental 
effects upon learning and upon the determinants and cognitive struc- 
turing of habits. 

In a study of the influence of metrazol convulsions upon maze- 
learning, Bunch and Mueller (19) found no significant differential 
effect upon the learning by rats of a 14-unit multiple T-maze. Heron 
and Carlson (63) substantially confirmed this observation, although 
Loken (85) indicated that both maze time and errors may be increased 
following these pharmacologically induced seizures. Stainbrook (136) 
found that the relearning maze time was greatly lengthened following a 
long series of electroshock convulsions although the relearning errors 
were no greater than in a control group. The differentiating ability of 
the dog in a conditioning situation is reported to be disturbed im- 
mediately following the administration of metrazol, but, according to 
Rosen and Gantt (123), discriminatory efficiency tends to return with 
recovery from the effects of the convulsion. Kessler and Gellhorn (78) 
have shown that convulsions reactivate in the rat a conditioned response 
previously inhibited by non-reinforcement. Rose, Tainton-Pottberg, 
and Anderson (121) administered a series of insulin shocks to a well- 
trained sheep in which a conditioned reflex had been standardized in 
tests extending over a period of seven years. They found that, following 
the hypoglycemic treatment, the conditioned reflex which had been 
almost entirely absent for one year reappeared with abnormal vigor. 
Interval leg movements, associated ordinarily with experimental 
neurotic behavior, appeared for the first time in the history of the 
animal and became a relatively permanent part of the animal’s reaction 
to the experimental situation. Page (104) and Stainbrook (135) have 
also observed signs of abnormal behavior in animals following electro- 
shock convulsions, and Stainbrook and de Jong (138) saw reactions 
of the rat indistinguishable from the state of “experimental catatonia” 
which ensued immediately after the cessation of an electroshock con- 
vulsion. 

Riess and Berman (117), using a relatively complex and a simpler 
maze, concluded that insulin shock has a greater disintegrating effect 
on a less well-fixed habit than upon one of greater fixation, that the 
disintegrative effect is greater on the learning retention of a longer and 
more difficult maze than on a shorter and easier maze, and that hypo- 
glycemic treatment has a disintegrating effect upon the initial acquisi- 
tion of a maze habit. Siegel (129), however, concludes that “the retention 
of a ‘barely learned’ simple response in the rat is not affected by a series 
of electroshock convulsions,” and he similarly reports that the ability 
to learn a simple habit is also not affected by such convulsions. Stain- 
brook and Lowenbach (139) have indicated that in a simple water-maze 
a long series of either electroshock or noise-fright* convulsions does not 
alter the maze behavior of the rat insofar as error scores are concerned; 
time scores, however, are significantly changed. 

All experimenters with convulsive technics who use animals in 
controlled learning situations must take into account the change in 
the behavior of the animal after it has been “shocked” a few times. The 
increase in maze time following such procedures is very probably a 
reflection of this behavior change. The conclusions of Riess and 
Berman, particularly, being one of the few reports showing a permanent 
shock-destructive effect upon already learned maze habits, must be 
weighed by this consideration, especially since they saw that insulin 
had an unfavorable effect upon the initial acquisition of the maze habit 
as well as upon learning retention. The change in maze behavior 
reported in the other situations measuring learning retention may have 
been due primarily to this general behavior change and not to a direct 
“shock” effect upon the habit system. Certainly, in a water-maze 
with constant escape motivation, it is impossible to significantly alter 
a well-trained maze habit, insofar as error scores are concerned, beyond 
an immediate post-convulsive period of shock effect varying with the 
spacing and number of convulsions given. 

* Noise-fright is used to describe the reactions in the rat which are called audiogenic, 
audioepileptic, or abnormal behavior by various other investigators. 

Obviously, in all these studies on the behavior of animals after 
convulsive treatment, both time and error scores should be used in 
quantifying the results and a description of the total behavior of the 
animals with a few simple operational specifications of such behavior 
should be included. 

The field of animal experimentation either with reference to the 
effects of shock upon habits in the elaboration of a theory of shock 
therapy or in relation to the use of convulsions in experiments designed 
to test various postulates of the theories of learning is currently quite 
open, and experiments of this nature should be extraordinarily fruitful 
for the pursuit of both objectives. 

THE USE OF THE WECHSLER-BELLEVUE SCALES: 

A SUPPLEMENT 

ROBERT I. WATSON 
Carnegie Institute of Technology 

In the July, 1945 issue of the Psychological Bulletin, Rabin (17) 
described in succinct fashion certain of the findings that have come 
about with the use of the Wechsler-Bellevue Scales with normal and 
abnormal persons. He stated that the review was “ an attempt 
to coordinate and summarize the findings to date ...” (17, p. 410). 
On examination of the literature through 1944 it appears that fifteen 
research studies and clinical applications have not been discussed. One 
of these studies appeared in 1941, two in 1942, five in 1943, and seven in 
1944. The purpose of this note is to supplement and to amplify the 
material given in the previous article.* The material is organized fol- 
lowing Rabin’s plan of presentation with the addition of two sections, 
one on use in the Armed Services, and the other on the rationale of the 
various subtests. Assuming accuracy of report, specific studies pre- 
viously presented will not again be examined with the exception of 
certain studies of Lewinski about which confusion might otherwise 
exist. 

Comparison with Other Tests 

Five additional studies which present comparisons between the 
Wechsler-Bellevue Scales and other measures appeared prior to 1945. 
The relationship between the Herring Revision of the Binet-Simon 
Tests and the Verbal Scale was examined in two studies by Lewinski 
(10, 14), who used as subjects 100 recruits referred to a Naval clinic 
because of suspected mental retardation. Scale A and Scale B of the 
Herring Revision showed correlations of .65 and .64 respectively with 
the Verbal Scale notwithstanding a limited intellectual range. For the 
two Herring scales the difference in IQ points from the Verbal Scale 
was less than 5 in 55 and 40 per cent of the cases, respectively, and more 
than 15 in 5 and 12 per cent. The four mean IQ’s were in the neighborhood of 
seventy, with the Wechsler IQ’s higher in 58 and 76 per cent of the 
cases. Goldfarb (5) administered the Revised Stanford-Binet, Form 
L, to 60 superior foster home children between the ages of 11 and 
17 and found the IQ correlation to be .86, .80, and .67 with the full, 
Verbal, and Performance Scales respectively. In 70 per cent of the 
cases the Stanford IQ was higher than the Wechsler-Bellevue IQ with 
the mean of the former being almost 5 points higher than that of the IQ 
on the full Wechsler-Bellevue Scale. Goldfarb concluded that the low 
scores on the Wechsler-Bellevue make the test relatively ineffective in 
discriminating among a group of superior adolescents. A further study 
comparing Stanford-Binet and Wechsler-Bellevue scores of negro and 
white criminals by Maizlish (16) has as yet appeared only in abstract 
form. Incidental to a comparison of narcotic addicts and matched 
hospital attendants, Brown and Partington (2) calculated for the entire 
group of eighty-four the correlations between the Wechsler-Bellevue 
and other tests. Among these were representative correlations of .66 
with the USPHS Number Series Completion Test, .63 with the USPHS 
Paper Form Boards, .60 with the USPHS Maze Test, .59 with the 
Ferguson Formboards, .48 with the Knox Cubes, and .41 with Healy 
Picture Completion Test II. 

* The bibliography is limited to studies appearing in 1944 or earlier years. 

Rabin, in his subsection on clinical status (17, p. 412), confuses two 
studies by Lewinski. The Lewinski study cited (12) is concerned with 
a comparison of the Kent Oral Emergency Test and the Verbal Scale. 
This study is summarized in Rabin’s subsection on correlation with 
other tests (17, p. 411). Contrary to Rabin’s statement, this article 
contains no mention of performance on individual subtests. The 
reference not cited is a study of the variability of the subtests of the 
Verbal Scale (11) to be discussed later. Rabin’s quotation in regard to 
Digit Span (17, p. 412) is found in this latter study. Still another study 
by Lewinski (13) omitted by Rabin is relevant to the problem of clinical 
status. The Wechsler Verbal Scale was given to 451 Naval recruits 
referred to a neuropsychiatric clinic for an examination to determine 
their fitness for duty. The recruits’ scores were grouped on the basis 
of Verbal Scale IQ into categories of Normal, Dull Normal, Borderline, 
and Mentally Deficient. Critical ratios between the mean scores of 
these groups for each individual subtest were calculated, and all were found 
to show statistically significant differences between immediately 
adjacent groups. It will be noted that this study went beyond that of 
Wechsler, Israel and Balinsky (22) in that it investigated the dis- 
criminatory value of the verbal subtests at the four levels of intelligence 
mentioned. 
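
A critical ratio of the kind reported here is conventionally the difference between two group means divided by the standard error of that difference. The Python sketch below illustrates the arithmetic only; the function name and the scores are invented for the purpose and are not Lewinski's data, and the formula is the standard textbook one rather than one quoted from his report.

```python
import math
from statistics import mean, stdev


def critical_ratio(group_a, group_b):
    """Difference between two means divided by the standard error of the difference."""
    # Standard error of each mean: sample standard deviation / sqrt(n).
    se_a = stdev(group_a) / math.sqrt(len(group_a))
    se_b = stdev(group_b) / math.sqrt(len(group_b))
    # Standard error of the difference between two independent means.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean(group_a) - mean(group_b)) / se_diff


# Hypothetical weighted scores on one subtest for two adjacent groups.
dull_normal = [7, 8, 6, 9, 7, 8, 7, 6]
borderline = [5, 6, 4, 5, 6, 5, 4, 5]

cr = critical_ratio(dull_normal, borderline)
print(f"critical ratio = {cr:.2f}")
```

The larger the ratio, the less plausible it is that the two means differ by chance alone; the threshold applied in the study is not stated in this summary.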

In resume, it would appear that fairly high correlations are found 
between the Wechsler-Bellevue Scales and verbal measures of intelli- 
gence but that the correlations with performance type scales are some- 
what lower, although still substantial. The trend reported by Rabin of 
relatively higher Wechsler-Bellevue IQ’s for duller subjects and 
relatively lower ones for brighter subjects is substantiated. Adequate 
discrimination by each of the verbal subtests at different intellectual levels 
is obtained. 


Test Results and Characteristic 
Patterns of Special Groups 

The studies reported in this section are concerned with the varying 
scores obtained on the subtests of the Wechsler-Bellevue Scale in 
various personality and behavior disorders. As Rapaport et al. point 
out, “one cannot expect every case within the same nosological group 
to have identical test-patterns; but one can expect that in any nosologi- 
cal group, a large share of cases will have, as a reflection of the disorder, 
similar impairments of certain intellectual functions, and these 
impairments will become evident in the intelligence test scores” (19, p. 23). 

The intertest variability of the scores of 158 mentally deficient and 
189 borderline defective Naval recruits on the Verbal Scale was reported 
by Lewinski (11). In the mentally defective group the mean score on 
the Arithmetic subtest was significantly lower and Comprehension and 
Similarities significantly higher than the other subtests. The com- 
paratively low score on Arithmetic corroborates the findings of Magaret 
and Wright (15) and Wechsler et al. (22). Cleveland and Dysinger (3) 
studied the Wechsler-Bellevue scores of 20 institutionalized senile 
patients with a mean age of 75.1 years. Many of the subjects made low 
or zero scores on the Performance subtests, which confirms the finding 
reported by Rabin (17) of a more consistent trend to decline in 
Performance subtest scores with age. Their results on the individual 
subtests disagree with those of Wechsler (21): contrary to his claims, 
they found that Similarities did “hold up” with advanced 
age and that Object Assembly did not. Unfortunately, only an 
abstract is as yet available of the study of Van Vorst (20) who used as 
subjects a relatively small number of delinquent boys who had been 
diagnosed as psychopathic personalities. He states that the results 
“ ... do not appear to support the claims ... to an extent which would 
justify defining any characteristic response pattern for the psychopathic 
personality.” 

The most extensive and detailed study of scatter of 
Wechsler-Bellevue scores yet to appear is that of Rapaport and his collaborators 
(19). Their nosological groups included paranoid and unclassified 
schizophrenics, both further subdivided into acute, chronic and dete- 
riorated cases; simple schizophrenics; paranoid conditions; preschizo- 
phrenics divided into coarctated and over-ideational; depressives 
further subdivided into psychotic, involutional, severe neurotics and 
neurotics; hysterics; anxious and depressed neurotics; mixed neurotics; 
obsessive compulsives and neurasthenics. In addition they used 
a control group further subdivided into smaller groups on the basis 
of trends in the direction of maladjustment exhibited. Although 
including a study of mean scatter and of the presence of extreme- 
ly high or low weighted scores, principal attention was focussed 
upon the deviation of the scores from the level of the vocabulary score. 
Extremely detailed findings on each of the nosological groups prevent 
any adequate short description. A quotation from one of their sum- 
marizations will illustrate both their findings and their rationale of 
these findings. 

The deteriorated unclassified schizophrenics show extreme impairment on 
almost all of the subtests. Comprehension and Digit Span are greatly impaired; 
Arithmetic is even worse; and worst of all are Picture Arrangement, Picture 
Completion, and Digit Symbol. In striking contrast, they show relative 
efficiency on Information, Object Assembly, and, to a lesser extent, Similarities 
and Block Design. In general, the scatter of the Deteriorated group may serve 
as an exaggerated representation of the schizophrenic pattern: impairment of 
judgment (Comprehension), concentration (Arithmetic, Picture Completion, 
Digit Symbol), and planning ability and anticipation (Picture Arrangement). 

The Rationale of the Subtests 

Very little attention has been paid to the psychological functions 
tapped by the subtests except in the formulations of Wechsler (21) and 
Rapaport et al. (19). The latter disagree with Wechsler on the bipartite 
classification into verbal and performance subtests. The basis for their 
difference of opinion is clinical experience, theoretical considerations 
and statistical findings, although the latter is limited to inspection of 
mean scores and standard deviations on each of the subtests for 261 
clinical and control cases. They conclude that both the Verbal and 
Performance Scales should be subdivided into two parts, all four 
“groups” differing one from the other in mean score and standard 
deviation. Vocabulary, Information, Similarities, and Comprehension 
are designated as the Essentially Verbal group since their common 
feature is the requirement of verbal presentation and response. The 
remaining two subtests of this scale, Digit Span and Arithmetic, are 
referred to as the Attention and Concentration group because of their 
presumed relationship to these functions. Verbalization is considered 
merely to be the form of communication. The Visual-motor 
Coordination group is so characterized because performance on the subtests 
Object Assembly, Block Design, and Digit Symbol is said to depend 
upon coordinated visual-motor activity. The Visual Organization 
group, Picture Arrangement and Picture Completion, requires no 
essential motor activity, but instead demands visual organization. Each subtest 
is also considered as calling into play a certain function or set of 
functions. For example, Digit Span is considered primarily a measure of 
attention, Similarities a measure of concept formation, and Arithmetic 
a measure of concentration. In many of their formulations they differ 
sharply with Wechsler (21) who considers, for example, that Digit Span 
is a measure of memory as well as attention, and that Arithmetic is a 
measure of reasoning. 

Use in the Armed Services 

The Wechsler-Bellevue Scale has been used extensively in the 
Armed Services (4, 7, 8, 10, 11, 12, 13, 14). In fact, as Hunt et al. say 
in reference to the detection of feeblemindedness it “ . . . has been 
adopted as the standard test for such use in the Navy” (8, p. 478). 
Both Hunt et al. (8) and Hildreth et al. (7), describing the psychometric 
procedures used at Naval Training Centers, give considerable promi- 
nence to the Wechsler-Bellevue Scale. As early as 1941, Rapaport (18) 
suggested a possible use of items from the Wechsler-Bellevue in examin- 
ing selective service registrants suspected of feeblemindedness. In three 
case histories with implications in regard to military selection, Knight 
et al. (9) described the use of Wechsler-Bellevue findings along with 
those from a variety of other measures. The Army has in recent years 
used a revision and extension of the Wechsler-Bellevue Scales, the 
Wechsler Mental Ability Scale (23). As described in Greenwood et al. 
(6), it consists of sixteen subtests, seven verbal and nine non-verbal 
or performance tests. It is so standardized that a selection of a lesser 
number of subtests appropriate to the situation may be made. Presumably 
in the immediate future we may expect to see many research 
studies appearing which use this presently restricted measure. 

Summary and Suggestions 

That the results of studies reported here would in some measure 
coincide with the results of those previously summarized is hardly 
surprising. A summarization showing considerable agreement with that 
of Rabin is natural, not to say inevitable. Repetition is obviously 
unnecessary. Therefore, a rounded summary is not attempted. Instead 
certain summary statements or cautions unreported previously will be 
stressed, based upon studies in both reviews. There are, however, 
certain points of disagreement of emphasis which should be mentioned. 

As part of the first sentence of his summary Rabin states, “... the 
Wechsler-Bellevue Scales . . . have supplanted some time-honored 
diagnostic tools” (17, p. 419). This rather broad and undocumented 
statement is open to question. It would perhaps be more correct to say 
that in the testing of adults they have supplemented other diagnostic 
devices such as the Stanford and Revised Stanford-Binet, the Babcock 
Test, and the Arthur Performance Scale, not to mention the whole 
array of less directly comparable measures which modern clinical pro- 
cedure affords. In this connection it might appear to many that the 
dismissal of age-level intelligence tests as “hotch-potch scales” (17, 
p. 413) is too cavalier a treatment of such measures as the Stanford and 
the Revised Stanford-Binet Tests. He goes on to say in regard to the 
Wechsler-Bellevue “... that it tends to differentiate better than other 
measures between the dull and feebleminded” (17, p. 419). It may be 
this will prove to be the case; as yet the evidence on the matter is 
hardly conclusive. For example, the one study described by him on the 
relative merits of Stanford-Binet and the Wechsler-Bellevue Scale in 
clinical effectiveness as an aid in the diagnosis of mental deficiency was 
performed at the Bellevue Hospital (1). Although tempered by the 
social and psychiatric data, there was a possible predisposition on the 
part of the psychologists to urge, and the psychiatrists to accept, as 
more valid the findings of a scale developed at their own hospital. For 
a second group in the same study this factor could not operate since the 
Wechsler-Bellevue was given to the subjects two or three years after 
the recommendation for commitment or non-commitment was made. 
For the 36 cases on which both measures were available the biserial 
correlation coefficient for the Wechsler-Bellevue was .720 ±.086 while 
that for the Stanford-Binet was .611 ±.103. Presumably the difference 
between these correlations is statistically non-significant. 
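
The comparison of the two coefficients can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the source does not say whether the "±" figures are standard errors or probable errors, and the two coefficients come from the same 36 cases, so treating them as independent (as the function does) is a simplification. The function name is hypothetical.

```python
import math

# Values reported in the text for the 36 cases (coefficient and its "±" term).
r_wb, err_wb = 0.720, 0.086   # Wechsler-Bellevue
r_sb, err_sb = 0.611, 0.103   # Stanford-Binet


def difference_ratio(r1, e1, r2, e2):
    """Return (difference, error of the difference, ratio).

    ASSUMPTION: the two estimates are treated as independent, so their
    error terms combine as the square root of the sum of squares.
    """
    diff = r1 - r2
    err_diff = math.sqrt(e1 ** 2 + e2 ** 2)
    return diff, err_diff, diff / err_diff


diff, err_diff, ratio = difference_ratio(r_wb, err_wb, r_sb, err_sb)
print(f"difference = {diff:.3f}, error of difference = {err_diff:.3f}, ratio = {ratio:.2f}")
```

Under these assumptions the difference is smaller than its own error term, whichever kind of error the "±" denotes, which agrees with the presumption that it is statistically non-significant.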

The differences of opinion in regard to rationale between Wechsler 
and Rapaport are both encouraging and disconcerting. Encouraging, 
because they show that attention is being given to the meaning of the 
subtest scores obtained; and disconcerting, because they show that 
much remains to be done before a thoroughgoing rationale for the 
various subtest scores is established. Some might find the emphasis on 
a pattern of “function” a retrogressive trend in clinical psychology 
smacking too much of a faculty concept of mental organization. Further 
experience, clinical, experimental and statistical, will eventually allow 
more final judgments to be drawn. Studies showing whether scores on 
the subtests are correlated with successful performance in other situa- 
tions which require the same functions seem indicated. Factor studies 
are also applicable. There is no doubt, even today, that individual 
clinical use seems to become more meaningful upon application of some 
conception of rationale. Enough evidence exists to show that a pattern 
of variations in scatter on the Wechsler-Bellevue Scale, in part, is 
produced by a pattern of psychopathy, but it must be remembered 
that even if the subtests were pure “functional entities,” differing 
educational backgrounds and cultural environments unrelated to 
psychopathy would tend to blur the findings, making the patterns 
indicative rather than invariably diagnostic. The Wechsler-Bellevue 
Scale is a valuable clinical instrument but it does not allow the substitu- 
tion of a list of diagnostic signs for clinical acumen. 
PSYCHOLOGY FOR THE ARMED SERVICES 


A SPECIAL REVIEW* 

JOHN F. DASHIELL 
University of North Carolina 

This is a third book published under the sponsorship of the Em- 
ergency Committee of the National Research Council, following as it 
does the highly-successful Psychology for the Fighting Man and Psychol- 
ogy for the Returning Serviceman. The original manuscript was drafted 
by the summer of 1944 under the general editorship of Boring, and was 
revised in the course of the fall and winter by other psychologists and 
by military experts, including Colonel Joseph I. Greene and Colonel 
Edward L. Munson, as well as by the Science Service expert, Miss 
Marjorie Van de Water. 

The present book is somewhat more factual and less advice-giving 
than the Psychology for the Fighting Man and considerably so than the 
Psychology for the Returning Serviceman. It “is intended as a textbook 
written on the college level, but also as a book in which the military and 
naval applications of psychological principles and the basic principles 
themselves are more fully developed than in the earlier book. It was 
believed that a single book might be equally useful as a textbook and 
handbook of psychology for general use by members of the Armed 
Services, not simply for instruction but also for individual reading and 
reference” (p. ix). In organization this book follows that of Psychology 
for the Fighting Man, almost chapter by chapter. The first ten chapters, 
largely on sensory functions of military value, present practically the 
same contents, with here and there some more details and often a 
rephrasing of passages to decrease their newspaper and increase their 
textbook character, without sacrifice in clarity or readability. The 
same is true for some of the remaining of the twenty-four chapters. Of 
those chapters receiving much fuller or even reorganized treatment, one 
should mention especially those on Selection of Men (chiefly by Bingham 
and Harrell), on Learning (Seidenfeld), on Personal Adjustment (Hunt, 
Abrams, Doll, and Seidenfeld), on Assessing Opinion and Discovering 
Facts (Edwards); and there is an added chapter on Differences Among 
the Peoples of the World (Klineberg, Child, and Canady). 

Certain questions naturally arise. The first that psychologists ask 
is: Is the material sound, scientifically and professionally? A few hasty 
assertions may be found in it. Examples: “It is doubtful that the 

* Committee of the National Research Council. Psychology for the armed serv- 
ices. Washington: The Infantry Journal, 1945. Pp. 519. 
muscles outside the eyeballs, the muscles that move the eyes, very often 
cause fatigue” (p. 33), “An after-dinner cup of coffee never kept any 
man awake because of the caffeine in it” (p. 222), and, “There is no 
evidence that the sexual act in animals is learned” (p. 398); but such 
assertions are almost inevitable in a book pitched at the level of this 
one — and these in particular are perhaps not to be disproven as yet 
by references to experimental data. Then there are certain misleading 
manners of statement that are likely to falsify the reader’s understand- 
ing of psychological phenomena. Examples: “The brain of the man who 
owns the retina sorts the colors out into objects” (p. 74), “Attention 
picks out one set of impressions and puts them together into one 
object” (p. 75), or, “It was discovered that they [rats] have mental 
maps” (p. 157). But over against such concessions to the non-technical 
reader there are warnings against too hasty conclusions, as in the 
sentence, “The experimental results on this matter [of heavy smokers 
and their supposed need for more rapid breathing to get their needed 
amount of oxygen] are not, however, conclusive” (p. 219). And it can 
certainly be said that for factual soundness, Psychology for the Armed 
Services does not disappoint the expectations originally engendered by 
inspection of the reassuring list of collaborators. 

The book before us is explicitly intended to serve as a textbook. Is 
it, then, teachable? That the intent is realizable is obvious enough. 
Each chapter and topic is organized to high degrees, with frequent 
employment of centerheads and sideheads that suggest didactic 
“points” for the student to set down and for the instructor to use as a 
skeleton for his presentations. Transitions from chapter to chapter are 
smooth enough. Not least important are the fairly generous illustrations 
and even a few simple tables, of wide range, whether taken from techni- 
cal psychological journals or from supplies of military photographs — 
all of these being admirably apposite, and introduced to inform more 
than to entertain. The references appended to each chapter are well- 
selected and should be of real help to teacher and student. 

Of recent years scientists have become more aware than formerly 
of the great gap between technical knowledge for the scientist and 
popular knowledge for the layman, and the urgency of building bridges 
between the two. To select those developments in scientific knowledge 
that are most relevant to human problems, to reorganize their state- 
ment into forms both understandable and interesting to the layman, 
and to make these cheaply available, is no mean achievement. And in 
the present book, together with Psychology for the Fighting Man, 
psychologists have before them effective examples of how the job can 
be done, and at two levels. 
As in the Psychology for the Fighting Man, over a fourth of this book 
is devoted to the study of sensory and sense-perceptual processes. It 
seems not unlikely that that is a fair reflection of the important role that 
seeing, hearing, smelling, and equilibration play in the combat ac- 
tivities of servicemen. But that it should be taken as pointing an error 
in the allotting of less space to those functions in textbooks of more 
general psychology, need not follow, as some may suppose (2). Indeed, 
another recent book on military psychology (3) allots only about 10 
per cent to them. 

Psychology for the Armed Services is easily distinguished from two 
other recent books by American psychologists for use in military in- 
struction. Its scope is much more inclusive than is that of the Penning- 
ton, Hough, and Case book (4); and by the same token it is addressed 
to a wider audience, including both commissioned and enlisted men. 
On the other hand, its avenue of approach is distinctly different from 
that of Meier’s (3) in that psychological rubrics determine its division 
into chapters, whereas discussions in the latter text take off from 
military problems and phases. The present work could be used in 
combination with either or both of the others mentioned. 

The cooperative sponsoring of this book and its predecessors, both 
in the writing and in the distribution, by military and by scientific 
authorities, is a hopeful augury for applied psychology. 

As a bookmaking job — paper stock, type, format, and binding — 
Psychology for the Armed Services is excellent. 
THE EFFECTS OF SCHOOLING UPON IQ 
A NOTE ON LORGE’S ARTICLE 

HENRY E. GARRETT 
Columbia University 

In a recent publication,¹ Dr. Irving Lorge has presented data which 
have been widely hailed as demonstrating the thesis that “education 
increases the IQ.” Such an interpretation of his study is entirely fal- 
lacious, and it would be most unfortunate if it were generally accepted. 
Lorge has, indeed, shown what psychologists have long known, namely, 
that group intelligence test scores increase with the amount of educa- 
tion. Moreover, he has been able to demonstrate from his data that 
when boys are roughly matched for mental ability at age 14, subsequent 
education is effective in increasing the group intelligence test scores of 
the better trained. 

I have no quarrel with this last conclusion, although I think the 
effect of education is somewhat less than Lorge represents it to be. I 
do strongly object, however, to his use — or misuse — of the term “IQ.” 
The uncritical impression gained from Lorge's paper, namely, that 
“education raises the IQ,” has been aired in the newspapers, and is 
already being parroted by those who would like to believe that the 
school book is more powerful than original nature. Unfortunately, the 
correction of a wrong notion often requires the presentation of technical 
detail which is, in general, carefully avoided by the loquacious but untrained 
reader. As a result, a popular fallacy, once established, is extremely 
hard to eradicate. The present instance offers a good illustration 
of this fact, and is another reason why the scientific writer should 
be extremely careful what he puts into print. 

There are two conclusions to be drawn from Lorge's report — the 
one true, the other false. I shall state these conclusions in order and 
summarize briefly the evidence for each. 

Conclusion I: The more recent and the more extensive a person's education, 
the better he is likely to perform on tests requiring reading, information, 
vocabulary, arithmetic; i.e., on tests involving words and numbers. 

This general result has been known for a long time; and any other 
finding would certainly seem quite unlikely. I shall cite evidence from 
only one source: the extensive data furnished by the Army Alpha 
test, a verbal examination administered to nearly two million men 
in 1917-18. The correlation² between extent of schooling and Alpha 

¹ Lorge, Irving. Schooling makes a difference. Teachers College Record, 1945, 46, 
483-92. 

² Yerkes, R. M., Ed., Memoirs National Acad. of Sciences, 1921, 15, 748. 
score in a group of 48,102 recruits was .74. Among white enlisted men, 
scores in Alpha increased steadily with education: for men who had 
completed 5-8 grades, the mean was 51.1; for men who had completed 
high school, 92.1; for men who had completed college, 117.8. Scores 
made by officers declined steadily with age (recency of schooling); 
clerks (workers with words and numbers) scored higher than machinists 
(workers with tools); chaplains higher than dentists. 

The evidence for the influence of education upon verbal intelligence 
tests seems, then, to be clear and convincing. Lest the educator become 
too complacent, however, I mention in passing that officers who re- 
ported schooling of 4 grades or less scored 20 points higher on Alpha 
(112.5) than white enlisted men who had completed high school (92.1). 
Perhaps there is food for thought in the suggestion that even a high 
school education is not always a substitute for brains in the first in- 
stance. 

Lorge’s data warrant special consideration since he has studied the 
effects of schooling upon test scores of men who 20 years earlier were 
equated (at least roughly) in group intelligence test score. It will be 
helpful to review briefly Lorge’s experimental material in order to 
provide a proper setting for subsequent discussion. In 1921, 863 boys 
around 14 years of age were given a series of abstract intelligence, 
mechanical aptitude and clerical ability tests. In 1941, 20 years later, 
131 of the original 863 were located and induced to take the Otis Self-Administering 
Test, Higher Examination, Form B (time limit: 20 
minutes), and Part III of the Thorndike Intelligence Examination for 
High School Graduates, Form V (time limit: one hour). According to 
Lorge, this sample of 131 was not significantly different in mean achievement 
or in variability from the original 863, and hence was fairly 
representative of the larger group. The educational achievement 
(highest grade completed) of these 131 survivors varied from 8 grades 
to 17 or more (beyond college), and the opportunity is afforded, there- 
fore, to discover how differences in extent of schooling affected the 
1941 scores of boys equated for test score in 1921. The 131 “boys” 
(34 years old in 1941) were classified into six groups on the basis of their 
1921 group intelligence test scores; and 1941 scores on Otis SA were 
tabulated for various levels of academic achievement. 

Table I reproduces Lorge’s Table I, and Table II is my summary 
of certain of these data. Consider the scores in Table I, column 89-98, 
the data especially cited by Lorge. A total of 23 boys out of 131 scored 
from 89 to 98 on the 1921 test. Of these 23 the 2 who completed grade 
8 had a mean score of 39.0 on the 1941 Otis test; the 2 who completed 
TABLE I*

Average Scores on the Otis Self-Administering Test of Mental Ability, 
Higher Examination, Form B (20-minute time limit), Taken in May 1941, 
Classified by Score on the 1921-22 Test of Abstract Intelligence and by 
Highest Grade Completed by May 1941: for 131 Boys Who May Be Considered 
a Representative Sample of the Vocational Guidance Grade Group 

                    Score on 1921-22 Test of Abstract Intelligence
                      49-58       59-68       69-78       79-88       89-98       99-114
Highest Grade        Otis        Otis        Otis        Otis        Otis        Otis
Completed by 1941   Score   N   Score   N   Score   N   Score   N   Score   N   Score   N
8                    14.0   4    22.0   4    20.7   9    26.4   5    39.0   2    33.0   1
9                    19.0   1    19.5   2    14.5   2    31.1   8    38.0   2    29.0   1
10                   24.0   1    22.0   4    25.1   9    28.5   8    37.0   4    46.5   2
11 or 12             21.0   1    26.0   1    31.7   3    31.0   9    41.0   3    34.0   1
13 or 14               --  --    22.0   1    26.0   3    34.7   4    41.7   4    37.5   2
15 or 16               --  --    34.0   1    27.0   1    39.5   6    53.5   2    50.8   5
17 or more             --  --      --  --    38.0   3    46.0   5    54.5   6    43.0   1

TABLE II

Summary of Certain Data from Table I

                    Intelligence Test Scores: 1921-22
                      59-68       69-78       79-88       89-98       99-114
Highest Grade        Otis        Otis        Otis        Otis        Otis
Completed by 1941   Score   N   Score   N   Score   N   Score   N   Score   N
8-9-10               21.5  10    22.1  20    29.0  21    37.8   8    38.8   4
13-16                28.0   2      --  --      --  --      --  --      --  --
15-17 plus             --  --    35.3   4    42.5  11    54.3   8    49.5   6
Difference:           6.5  12    13.2  24    13.5  32    16.5  16    10.7  10


* From Lorge, op. cit., p. 487. 


grade 9 had a mean score of 38.0 on the 1941 test, and so on. The 6 boys 
who completed 17 or more grades (college and beyond) had a mean 
1941 score of 54.5. Averaging the three entries at the top and the two 
entries at the bottom of this column we find (Table II, column 89-98) 
that the 8 boys who completed 8, 9 and 10 grades had a mean 1941 
score of 37.8, while the 8 boys who completed 15 or more grades had an 
average 1941 score of 54.3. These two groups differed by 16.5 points on 
Otis SA in 1941, and this difference Lorge attributes to the better schooling 
of the second group. Several comments are in order: 

1. The 16 boys in column 89-98, Table II, were not strictly equated 
in score in 1921. Their scores, in fact, show a 10 point range (89-98), 
while in 1941 the range between the 8 with the best and the 8 with the 
poorest education is 16.5 points. These ranges, to be sure, are not 
directly comparable; the tests are in different units and the second 
range is between two selected means and is restricted as compared with 
the original range of 10 points. It seems reasonable to infer, however, 
that had the 16 boys been more exactly equated in 1921 score, the gain 
of 16 points might have been somewhat reduced. Evidence for this 
inference can be found in the other columns of Table II: differences 
in Otis SA score attributable to 7-8 years of schooling are 6.5, 13.2, 
13.5, and 10.7. Lorge, incidentally, gives comparative data only for the 
column in which the change is greatest. 

2. It seems clear that 7-8 years of additional schooling did make a 
difference in the performance of these 8 subjects upon group intelligence 
tests. But the change wrought by education is certainly modest and 
hardly warrants a feeling of smugness on the part of their teachers. 

3. Any conclusion as to the effects of schooling upon the Otis test 
scores of subjects equated for intelligence level 20 years earlier must 
necessarily be qualified by the small sizes of the samples. The 16 boys 
cited above bear a heavy burden; and results from even 131 subjects 
can at best be regarded as suggestive rather than conclusive. 
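
The pooled means and differences summarized in Table II can be rechecked from the cell means and Ns of Table I. The sketch below is a hedged reconstruction, not Lorge's or the author's procedure: it assumes each Table II entry is an N-weighted mean of the corresponding Table I cells, and it transcribes only the rows that enter the summary (grades 8, 9, 10 against the highest grades; for the 59-68 column, where no one completed 17 or more grades, grades 13 to 16 serve as the upper group). The dictionary and function names are hypothetical.

```python
# (mean Otis score, N) for the Table I cells that enter the Table II summary.
# Rows 11-12 (and 13-14, except in the 59-68 column) are omitted on purpose,
# as is the 49-58 column, which Table II does not summarize.
LOW_GRADES = ("8", "9", "10")

TABLE_I = {
    "59-68": {"8": (22.0, 4), "9": (19.5, 2), "10": (22.0, 4),
              "13 or 14": (22.0, 1), "15 or 16": (34.0, 1)},
    "69-78": {"8": (20.7, 9), "9": (14.5, 2), "10": (25.1, 9),
              "15 or 16": (27.0, 1), "17 or more": (38.0, 3)},
    "79-88": {"8": (26.4, 5), "9": (31.1, 8), "10": (28.5, 8),
              "15 or 16": (39.5, 6), "17 or more": (46.0, 5)},
    "89-98": {"8": (39.0, 2), "9": (38.0, 2), "10": (37.0, 4),
              "15 or 16": (53.5, 2), "17 or more": (54.5, 6)},
    "99-114": {"8": (33.0, 1), "9": (29.0, 1), "10": (46.5, 2),
               "15 or 16": (50.8, 5), "17 or more": (43.0, 1)},
}


def pooled_mean(cells):
    """N-weighted mean of (mean, n) cells; returns (mean, total n)."""
    total_n = sum(n for _, n in cells)
    return sum(mean * n for mean, n in cells) / total_n, total_n


def summarize(column):
    """Return (low mean, low N, high mean, high N, difference) for one column."""
    rows = TABLE_I[column]
    low = [rows[g] for g in LOW_GRADES]
    high = [cell for g, cell in rows.items() if g not in LOW_GRADES]
    low_mean, low_n = pooled_mean(low)
    high_mean, high_n = pooled_mean(high)
    return low_mean, low_n, high_mean, high_n, high_mean - low_mean


for column in TABLE_I:
    low_mean, low_n, high_mean, high_n, gain = summarize(column)
    print(f"{column:>6}: low {low_mean:.2f} (N={low_n}), "
          f"high {high_mean:.2f} (N={high_n}), difference {gain:.2f}")
```

On this assumption the recomputed values agree with Table II to within rounding in the first decimal place, and the 89-98 column cited by Lorge does show the largest difference.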

Conclusion II: “Schooling raises the IQ.” There is no evidence in 
Lorge's report to substantiate this statement. 

We have seen that the 8 boys who scored between 89 and 98 upon 
an intelligence test in 1921, and whose subsequent schooling was 
8-9-10 grades, had an average score of 37.8 on the Otis SA in 1941; 
and that the 8 boys who scored between 89 and 98 upon an intelligence 
test in 1921, and whose subsequent schooling was approximately 4 
years of college, had an average score of 54.3 on the Otis SA in 1941. 
According to tables supplied for the Otis SA examination, a score of 38 
is equivalent to a “Binet IQ” of 103, and a score of 54 to a “Binet IQ” 
of 115. The first score, namely, 38, is also found from the tables to be 
equivalent to a “mental age” of 16-5; the second score of 54 to a “mental 
age” of 18-5. Comparison of these two MA’s and two IQ’s furnishes 
the basis for Lorge’s statement that for boys of “equal” intelligence 
at age 14, 7-8 years of schooling produce an increase by age 34 of “two 
full years” in MA (i.e., 16-5 to 18-5), or a change in IQ from 103 to 
115. 
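
The conversion just quoted is simple arithmetic on ages written as years-months, and it can be sketched as follows. The helper names are hypothetical. The "implied divisor" is only a back-calculation from the two score pairs given in the text, on the assumption that the tabled IQ is 100 times the MA divided by a fixed adult age; the source does not state how the Otis tables were built.

```python
def to_months(age: str) -> int:
    """Convert an age written as 'years-months' (e.g. '16-5') to months."""
    years, months = age.split("-")
    return 12 * int(years) + int(months)


def implied_divisor_months(mental_age: str, iq: float) -> float:
    """Age divisor (in months) that would turn this MA into this IQ,
    ASSUMING IQ = 100 * MA / divisor."""
    return 100.0 * to_months(mental_age) / iq


# The "two full years" in MA quoted from Lorge.
gain_months = to_months("18-5") - to_months("16-5")
print("MA gain in months:", gain_months)

# Back-calculated divisors for the two (MA, IQ) pairs given in the text.
for ma, iq in (("16-5", 103), ("18-5", 115)):
    print(ma, iq, "implied divisor (years):", round(implied_divisor_months(ma, iq) / 12, 2))
```

Both pairs come out close to a divisor of 16 years, which is an inference from the two quoted values only. If it holds, the "IQ" of these 34-year-old men is not a ratio of mental age to their actual age at all.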

It is worthwhile examining further this entirely erroneous conclu- 
sion. Psychologists conversant with test construction know that rarely 
if ever is an MA or IQ derived from a group test even remotely comparable 
to the IQ achieved on Stanford-Binet. This is recognized by 
Otis himself, who writes,³ “The term Mental Age (capitalized), however 
has now come to have a special meaning and to denote measures of 
mental ability — i.e., scores — in the Binet-Simon tests. Binet mental 

³ Otis, A. S., Manual of Directions, Otis Self-Administering Tests of Mental Ability, 
Yonkers, World Book, 1928, p. 4. 
ages below about 13 years are true mental ages. Above that, especially 
above 16 years, they are merely scores.” (Italics mine.) Again, Otis 
writes,⁴ “. . . IQ's as sometimes derived from group tests of mental 
ability bear little relation to IQ's derived from the Binet tests.” To be 
sure, Otis has attempted by means of an ingenious chart to provide 
MA’s and IQ's which are estimates of Binet MA’s and IQ’s that subjects 
might have earned when younger. But such estimated MA’s and IQ’s 
are neither psychologically nor statistically equivalent to Binet values, 
and the psychologist who uses these terms without proper explanation 
and qualification — as does Lorge — performs no service other than that 
of confusing and misleading his readers. 

The IQ is a developmental concept, suitable for use with age- 
scales such as Stanford-Binet. The function of the IQ is to balance 
mental status against life age over the period when growth is progressing. 
Up to age 14-15 the IQ is useful and informative: after this period 
the MA no longer increases with CA and the IQ is of no value. To use 
the term IQ with reference to 34 year olds is indefensible and incorrect. 

The IQ should be used only with tests so constructed that MA’s 
derived from test scores will, when divided by CA, return a constant 
ratio. Requirements for a constant IQ are well known⁵ but perhaps may 
be profitably repeated here. In order to yield a constant IQ a mental 
test must (1) exhibit IQ distributions, the SD’s of which are equal at 
each age level (this means that SD’s of MA distributions must increase 
systematically with age); (2) measure the same abilities at successive 
age levels; (3) provide IQ’s which show no consistent tendency either 
to increase or to decrease with CA. 
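
The ratio definition and the first requirement can be made concrete with a short sketch. The function names, the IQ standard deviation of 16, and the plateau at 15 years are illustrative assumptions chosen for the example, not values taken from the Stanford-Binet standardization.

```python
def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio IQ: 100 * MA / CA, with both ages in months."""
    return 100.0 * mental_age_months / chronological_age_months


def ma_sd_for_constant_iq_sd(iq_sd: float, chronological_age_months: float) -> float:
    """SD of the MA distribution (months) needed at a given CA
    if the SD of the IQ distribution is to stay fixed at iq_sd."""
    return iq_sd * chronological_age_months / 100.0


ILLUSTRATIVE_IQ_SD = 16.0      # assumed for illustration only
PLATEAU_MA_MONTHS = 15 * 12    # hypothetical MA that stops growing at about 15 years

# Requirement (1): the MA spread must widen in proportion to age.
for ca_years in (6, 10, 14):
    sd = ma_sd_for_constant_iq_sd(ILLUSTRATIVE_IQ_SD, ca_years * 12)
    print(f"CA {ca_years} years: MA SD needed = {sd:.1f} months")

# Once MA no longer rises with CA, the literal ratio simply shrinks with age.
for ca_years in (15, 20, 34):
    iq = ratio_iq(PLATEAU_MA_MONTHS, ca_years * 12)
    print(f"CA {ca_years} years: ratio IQ with plateaued MA = {iq:.1f}")
```

The second loop shows why the quotient loses its meaning for adults: with a mental age that has stopped growing, the ratio falls steadily as chronological age increases, whatever the person's actual ability.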

The requirements for a constant IQ are not met by any group tests 
with which I am familiar. They are met by the Stanford-Binet. The 
term IQ, therefore, should never be used with group tests but should 
be confined strictly to tests of the Binet type. To be sure, some group 
tests⁶ keep the IQ approximately constant by assigning an IQ of 116, 
say, to the score in each age distribution one sigma above the mean; 
an IQ of 132 to the score two SD above the mean, etc. But such statistical 
IQ’s are not equivalent to Stanford-Binet IQ’s, and when used interchangeably 
confuse rather than inform. The Otis SA Test does not 
provide MA’s and IQ’s equivalent in meaning to the Binet values. 
Lorge’s use of the terms MA and IQ, therefore, is unwarranted. And 
the implication from his study that 7-8 years of education will raise 
the IQ from 103 to 115 (“two full years”) is without foundation in fact. 
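
The "statistical IQ" described above, in which 116 is assigned one sigma above the age-group mean and 132 two sigma above, amounts to a linear rescaling of a standard score. A minimal sketch follows; the function name and the age-group norms are hypothetical and are not taken from any published test manual.

```python
def statistical_iq(raw_score: float, age_group_mean: float,
                   age_group_sd: float, iq_sd: float = 16.0) -> float:
    """Convert a raw score to a 'statistical IQ' within its age group:
    100 at the group mean, plus iq_sd points per standard deviation."""
    z = (raw_score - age_group_mean) / age_group_sd
    return 100.0 + iq_sd * z


# Hypothetical norms: (mean raw score, SD) for two age groups.
HYPOTHETICAL_NORMS = {12: (40.0, 10.0), 14: (50.0, 12.0)}

for age, (mean, sd) in HYPOTHETICAL_NORMS.items():
    one_sigma = statistical_iq(mean + sd, mean, sd)
    two_sigma = statistical_iq(mean + 2 * sd, mean, sd)
    print(f"age {age}: +1 SD -> {one_sigma:.0f}, +2 SD -> {two_sigma:.0f}")
```

Such a value is constant by construction, since it depends only on standing within the age group. It involves no mental age and no quotient, which is the sense in which it differs from a Binet IQ.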

⁴ Otis, A. S., op. cit., p. 5. 

⁵ McNemar, Quinn. The revision of the Stanford-Binet Scale. New York: Houghton 
Mifflin, 1942, p. 155. 

⁶ Pintner General Ability Tests, Verbal Series. Yonkers, N.Y.: World Book, 1939. 



NOTE ON DUNLAP’S REMEDY FOR 
DEFECTIVE COLOR VISION 


J. H. ELDER 
Louisiana State University 

It is a well-known fact that normal color vision is a requirement 
for qualification in certain industrial occupations. Because of the oc- 
currence of some degree of defect in color vision in five to ten per cent 
of the male population, there has been a great incentive during the war 
years to discover some means by which otherwise eligible men in this 
group could be salvaged for military specialties. Diligent research 
sponsored by Army and Navy experts has failed to produce favorable 
evidence for any remedy for color blindness. 

Among the various remedies tested was vitamin A. It appears that 
the credit for suggesting vitamin A for this purpose belongs to Dunlap 
and Loken (1). The general effect of the suggestion, in addition to 
stimulating some unsuccessful experimental attempts to confirm it, 
was to encourage the development of quick cures for men who wished to 
evade the draft by volunteering for officer training. For a time the 
number of such treatments constituted a small racket which was not 
easily curbed (2). 

While the problem of correcting defective color vision seems less 
urgent with the end of the war, the fact that Dunlap (3) persists in an 
elaborate defense of his original unsubstantiated notions on the effec- 
tiveness of vitamin A requires some examination. 

In reading Dunlap’s latest report one is reminded constantly of the 
difficulties of conducting research from an academic base on a shifting 
wartime population. The incidence of frustrated plans; the prevaricating, 
vagrant, and otherwise unreliable subjects; and the sudden induction 
of men on the threshold of a cure, all tend to make the reader 
of the report sympathetic. Under such difficulties it is not surprising 
that the results obtained were of little value. 

Aside from some original and very novel speculations as to the 
nature, distribution, and possible causes of color blindness the only new 
material presented by Dunlap consists of 14 case histories of individuals 
who received vitamin A and cobra venom. In some cases perhaps 
vitamin B was given. The criterion tests were Loken’s modification of 
the Nela test, and the Ishihara and American Optical Company pseudo-isochromatic 
plates. 

In 13 of the 14 cases presented the treatment was not complete, 
according to the author’s own statement. In the other case the conclusion 
was that the defect was irremediable. If the summary statements 
given on each case may be interpreted literally, then Dunlap 
himself rejects 12 of the 14 cases as questionable. On the basis of results 
presented on tests before and after treatment he should have rejected 
14 out of 14. 

To fully appreciate the complete failure of these 14 cases to give 
support for vitamin A treatment it would be necessary to examine the 
data and to comment on each one, but detailed review here is not 
justified. Of the group as a whole Dunlap says: “While no conclusions 
could be drawn from these cases, their features suggest certain proba- 
bilities and agree with some of the notions which had occurred to us 
earlier.” 

As a check on Dunlap’s original report that 80 per cent of his cases 
were able to pass chart tests after taking vitamin A, Elder (4) examined 
approximately 900 R.O.T.C. cadets at Louisiana State University. Of 
the 65 men showing various degrees of defective color vision, 41 com- 
pleted the course of treatment. Results were entirely negative. The 
method used in this experiment closely paralleled that of Dunlap’s even 
to the use of Ishihara and American Optical Company charts which he 
now condemns as unreliable but continues to use. 

Dunlap suggests that pro-vitamins were used in the LSU experi- 
ment and that the units of vitamin A administered were unknown, 
presumably because the material was obtained from a questionable 
source. A careful reading of my report shows that “the material used 
was a vitamin A ester of high potency, determined spectrographically 
and confirmed by bio-assay” and that it was supplied by the Norwich 
Pharmacal Company. The implication that vitamin products of this 
company may be inferior to those of Squibb, Lilly, or Upjohn, indicates 
that Dunlap is unaware of Norwich research facilities and its systematic 
assay of all vitamin preparations. 

If Dunlap persists in defending the notion that vitamin A has a 
beneficial effect on defective color vision, we are entitled to request that, 
in his next study, he observe the following conditions: 

a. Use only subjects with demonstrable defect (by any standard criterion 
he chooses). 

b. Leave nothing to be presumed as to whether subjects take their doses 
regularly and in prescribed amounts. Supervised administration is not difficult. 

c. Carry through complete treatment on at least a few cases so that there 
will be no need for guessing as to what might have happened if doses could 
have been continued for a few more weeks. 

d. Take necessary precautions against the possibility of learning the tests. 
(It is known that many men have passed service tests in this way. Confirmation 
of final test results with an additional test which the subject had not seen previ- 
ously is desirable.) 

It is impossible to abandon completely the hope that Dunlap's 
notions may lead to new knowledge of treatment for defective color 
vision. Until we have that knowledge in the form of thoroughly reliable 
observations, further discussion is superfluous and misleading. 

Linton, Ralph (Ed.) The science of man in the world crisis. New York: 
Columbia Univ. Press, 1945. Pp. xiv+532. 

The science of man referred to in the title of this symposium is 
anthropology. The editor, himself an anthropologist, feels that the time 
is ripe for a synthesis of the sciences which deal with human beings and 
their problems. Though “anthropology is by no means the only 
discipline which has concerned itself with the study of man,” he maintains 
that it holds a central position among these disciplines and that it 
should assume the responsibility for synthesizing their contributions. 

The purpose of the particular synthesis attempted in this volume is 
to give people who are planning a more satisfactory world order a better 
understanding of the potentialities and limitations of the human 
material with which they are dealing and to provide them with tech- 
niques for controlling that material. Several chapters deal in a practical 
and helpful way with world resources, world population problems, and 
the administration of Indian and colonial affairs. One chapter shows 
how the results of the analyses of American radio audiences and pro- 
grams might be helpful to a world government in its attempts at in- 
fluencing public opinion. 

Though there are several references to the need for a study of the 
biological man, no biologist is included among the 22 contributors. This 
is the more surprising since the anthropologists seem agreed that culture 
has no existence apart from the individuals who originate, transmit, and 
change it, and, therefore, that culture must bear the imprint of human 
nature. 

Klineberg is the only psychologist contributing to this volume. His 
chapter on racial psychology is an excellent, carefully worded, brief 
survey and interpretation of psychological research in the area of 
racial differences. 

In their discussions of cultural assimilation and change, several of 
the anthropologists lean heavily on learning theory as developed by the 
Yale Institute of Human Relations. Cultural changes are explained in 
terms of their capacity to relieve anxieties or to provide positive 
satisfactions. On the other hand, it is evident that some anthropologists do not 
find this theory adequate to explain all aspects of cultural transmission. 
The child seems to acquire some of the culture of his society in an 
automatic fashion in which it is difficult to discern any signs of either 
drive or reward. 

One chapter reports a symposium on the concept of culture. The 
participants are several anthropologists, a psychologist, an historian, a 
lawyer, a physician, a psychiatrist, a philosopher, a business man, and 
an economist. Psychologists will find it revealing and chastening to get 
the anthropologist’s reactions to their theories, approaches, and 
terminology. 

It is evident from this book that modern anthropology is increasingly 
interested in the universals of human nature and of human cultures. 
One chapter, on the common denominators of culture, deals entirely 
with the surprisingly large number of activities, such as religion, faith 
healing, dancing, family living, incest taboos, fire making, and cooking, 
which are found in all cultures, past or present, throughout the world. 
Instead of attributing these universals in a facile fashion to instincts or 
even to simple biological needs, the author shows in a clear and pene- 
trating fashion how they are the outcome of the complex interplay of 
these needs with secondary, acquired motives developed as a result 
of the world wide functioning of the principles of learning. 

This book marks a welcome, albeit modest, advance toward an 
actual synthesis of the sciences which study man. Its principal value 
to psychologists will probably be (1) as evidence of the values to be 
derived from closer cooperation with the other biological sciences and 
(2) as a survey of the frontiers of research in anthropology proper. 

Clarence Leuba. 

Antioch College. 

Kardiner, Abram, (with the collaboration of) Linton, Ralph, Du 
Bois, Cora, and West, James. The psychological frontiers of society. 
New York: Columbia Univ. Press, 1945. Pp. xxiv+475. 

Rarely are psychologists presented with a book more challenging 
in its social implications than this one. It is truly a cooperative en- 
deavor, involving not only the skill of the authors named in the title 
above but also of Dr. Emil Oberholzer who analyzed independently, 
or without collusion, 38 Rorschach tests which had been administered 
by Du Bois to natives of Alor. The raw data were supplied by the col- 
laborators. They consist of careful descriptions of three cultures, 
namely, the Comanche culture as described by Linton, the Alorese 
culture pictured by Du Bois, and the culture of an American rural 
community presented by West. Although the analyses and interpreta- 
tions of these cultures, as well as the conclusions, are the work of 
Kardiner, there is evidence indicating a frequent interchange of ideas 
among all of the collaborators with the exception of the Rorschach 
specialist. The result is a well-coordinated book in spite of the diversity 
of its subject matter. 

The concept which is indispensable for the interpretation of the 
different societies is basic personality. There can be no doubt that 
Linton in his Foreword and Kardiner in Chapter II, The Technique of 
Psychodynamic Analysis, but more especially in the main body of the 
book, have indicated the anatomy and dynamics of basic personality 
with greater clarity than they achieved in an earlier volume entitled 
The Individual and His Society. Preferred to all others in deriving 
the characteristics of basic personality is the psychoanalytic technique. 
Not, however, basic drives or “instincts” but action systems with 
identifiable perceptual, coordinative, and executive features are the 
fundamental constituents of basic personality. There is a generous 
departure from Freudian theory, a departure dictated somewhat by the 
anthropological data. It must not be assumed that Kardiner has ap- 
proached his task with ready-made concepts, or that he has forced the 
data into preconceived molds. The nuclear constellations of basic 
personality, differing as they do from society to society, have been 
distilled from materials collected by the anthropologists. Kardiner, 
therefore, has not merely repeated the anthropological findings to 
append to them facile interpretations, rather he has painstakingly 
studied the data to derive his concepts from them and to uncover 
relationships not readily discernible to the original investigators. He 
has shown, for instance, how in infancy and early childhood the impact 
of maternal care, sibling interactions, paternal dominance or its absence, 
and of conduct more indirectly related to institutions, establish action 
systems which in their dynamic interactions yield basic personality. 
These action systems give rise to projective systems which later deter- 
mine, for example, religion, folklore, and the arts. The projective 
systems are supplemented by reality systems which deal with explana- 
tions of the outer world and ways of dealing with it, or with conven- 
tionalized means of getting along with other people. Projective systems 
and reality systems together constitute basic personality but the former 
are both primary and dominant; in fact, the projective systems exercise 
a directing influence over the reality systems. 

Kardiner does not stop with the derivation of the type of basic 
personality characterizing the three cultures studied intensively, in- 
stead he proceeds to demonstrate how the basic personality, in turn, 
explains many of the differences among the cultures and within each 
society. He has sought, in a measure, to deal also with the problem of 
variability among the members of a given society, a problem which 
was neglected in the earlier volume referred to above. In this task he 
derives considerable aid from biographies of persons in Alor and in 
Plainville, U.S.A. 

The similarity of interpretation between analyses of personality 
made by Kardiner and by Oberholzer, who did the Rorschachs “blindly,” 
is striking. At least this is so for what Kardiner calls the diagnostic 
features but not for the dynamics of behavior. Even though we are 
told there is no substitute for actually living with people, the corre- 
spondence in interpretation leaves us wondering which method of 
analysis contributes more to the validity of the other. Both methods 
combined suggest problems not likely to have been discovered by either 
one alone. 

In the final or fourteenth chapter, Kardiner permits himself a flight 
of controlled imagination, a thought provoking and intriguing 
interpretation of history in accordance with his concepts of basic 
personality. His discussion of the social significance of projective systems 
and the part they play in stabilizing the social order is packed with 
meaning for the psychologist and sociologist. 

Only in the first chapter does the author appear to betray a slight 
aggressiveness not accounted for by the psychological techniques he 
condemns. Classical psychologies generally associated with the names 
of Wundt, Watson, Pavlov, and of Wertheimer, Köhler, and Koffka, 
are criticized because they render little or no aid in determining the 
dynamics of social behavior. Lewin's topology is dismissed in three 
sentences, or shall we say it is identified with psychoanalysis as an 
experimental handmaiden? Yet Kardiner throughout the book does not 
hesitate to write of circumstances conditioning the child one way rather 
than another. Had he desired to trace carefully and specifically how 
the earliest components of action systems were learned and integrated, 
I suspect his indebtedness to Pavlov would not be less than that owed 
to Freud. Psychoanalytic methods are not applicable to all aspects of 
experience and behavior, nor are the methods of any one of the current 
psychologies Kardiner has found wanting. Since psychoanalytic 
techniques and concepts enabled him to explain even the apparently 
unrelated forms of behavior, institutionalized and otherwise, he did not 
need to defend his preferred psychological system through a series 
of counter-attacks. 

The book is not a text for undergraduate students, although the 
anthropological accounts will have an intrinsic appeal to them. A 
considerable knowledge of psychodynamics is required to follow and 
understand the analyses of the several societies. Graduate students 
in psychology, sociology, and anthropology, and especially their in- 
structors, have been challenged to reexamine their basic concepts and 
interpretations of social behavior. They cannot afford to neglect a 
volume as richly suggestive as this one. 

Charles Bird. 

University of Minnesota. 

Lazarsfeld, P. F., Berelson, B., & Gaudet, H. The people's choice. 
How the voter makes up his mind in a presidential campaign. New 
York: Duell, Sloan, & Pearce, 1944. Pp. vi+177. 

This book is not a report of a routine political public opinion poll 
but is a cogent analysis of the factors that influence the voter in making 
his decision. As such, a careful reading will repay sociologists and 
political scientists as well as psychologists. 

The first two sections lay the background by describing the people 
and social milieu of Erie County, Ohio, chosen as the sampling area 
because of its relative representativeness. Two sections are devoted to 
a discussion and interpretation of the data. Appendices refer the reader 
to appropriate source material and elaborate certain technical items. 

The data were gathered by eight waves of personal interviews 
obtained at one-month intervals just prior to the 1940 presidential 
election. Three control groups were used to determine whether the 
initial interviews with the main panel would make them “election 
conscious” and influence responses on subsequent questionnaires. 
Intensive repeat interviewing permitted a thorough study of voters’ 
characteristics, but more important, allowed respondents’ attitudes and 
opinions to be related to events of political significance soon after they 
occurred. 

Social and cultural factors are demonstrated to be marked deter- 
minants of voters’ preferences, and although people made up their 
minds to vote in a given way at different stages of the campaign, they 
tended to do so in terms of their political predispositions. The political 
campaign developed interest in the election for the indifferent, but did 
little converting of the partisan. Campaign arguments merely provided 
the latter with rationalizations to support his preferences. “Opinion 
leaders,” who were more alert to events but more unresponsive to 
arguments opposing their predilections, were found to affect the mass 
of voters more than formal discussions in magazines and newspapers 
and on the radio. 

The information presented regarding election behavior is in itself 
important, but in their ability to evoke meaning from the data, the 
authors made their greatest contribution. Fractionation and recombina- 
tion of the data bring out pertinent relationships that would not be 
apparent from a simple tabulation of responses. Some will regret that 
the methods employed in gathering the data were not treated in greater 
detail, but few will deny that the use of scientific polling techniques as 
demonstrated in this book will make possible the addition of new con- 
cepts in social psychology. 

Lester Guest. 

The Pennsylvania State College. 

Lindner, R. M. Rebel without a cause; the hypnoanalysis of a criminal 
psychopath. New York: Grune & Stratton, 1944. Pp. ix+296. 

For those who have never had first hand experience with psycho- 
analysis — nor read James T. Farrell — this book will be both interesting 
and educative. The suggestion of Sheldon and E. T. Glueck in their 
short introduction that the book is a significant milestone in crimi- 
nology, however, is rather startling in view of the fact that it presents 
the results of only one case and uses a method that is by no means new 
in spite of the author’s seeming claims in that direction. 

Two hundred and fifty-nine of the 289 pages of text are the actual 
stenographic results of the psychoanalysis of Harold, who is psychiatri- 
cally diagnosed as a psychopathic personality and who is heavily 
sentenced to the penitentiary for a serious crime, the nature of which 
is not disclosed. During the analysis Dr. Lindner meets the customary 
resistance from his subject and speeds its resolution with the aid of 
hypnotic recall, justifiably pointing out that this valuable therapeutic 
technique has been too cavalierly dismissed by analysts ever since 
The Master discovered it was not a panacea. 

The conclusion of the author that his method has verified “the major 
but until now unproven hypotheses of the traditional psychoanalytic 
view of this entity” (psychopathic personality) seems presumptuous 
in view of the fact that (1) he has not been dealing with an entity, and 
(2) the logical meaning of proof is something more than restatement on 
the basis of one case. Yet the fact that hypnoanalysis as a method 
of diagnosis and treatment has great value is overwhelmingly evident 
even when the actual results are down in black and white and therefore 
robbed of some of their emotional force. That Dr. Lindner has skill- 
fully conducted the treatment is also apparent, yet it will be this neces- 
sary skill combined with an admittedly rare rapport which will limit 
the widespread acceptance of this method for practical penological 
uses. 

The first and second chapters discuss respectively the problem of 
criminal psychopathy and the author’s method employed in its study. 
With the exception of a short summary chapter the remainder of the 
book is given over to the analysis record with occasional interpolated 
remarks by the author. 

With a brilliant stroke of the technical mot juste, the author de- 
scribes the definitive and classificatory role of psychopathic personality 
in the psychiatry of the past as a Pandora’s box. The force of the 
comparison for critical purposes, unfortunately, is somewhat clouded 
in his succeeding discussion when the psychopathic personality quickly 
becomes a “they” to which are attributed both general and specific 
typing characteristics. It is indeed strange to read on the first page of 
chapter one, for example, that this catch-all category has no accepted 
delimitation, and then on the last page of chapter two that the author’s 
method has finally “penetrated to the core of psychopathic personality 
for the first time in the long history of psychological concern with this 
puzzling classification.” Nowhere in the pages between or the ones that 
follow is there presented any definition of this “entity” which would 
give it a core to be penetrated. 

The situation is further clouded by the facile use of analytic 
circularisms which, as usual, pose as explanations. Although the author 
recognizes that the statement, “The psychopathic personality super-ego is 
weak” is circular (he uses the words “self-explanatory”), he feels no 
dissatisfaction when he explains the acknowledged infantile behavior 
of the psychopath as “an abrupt cessation of psychosexual 
development.” In addition to the possible multiplication of such logical 
sterilities, the author can also be charged with illegitimate borrowing of 
what may well be principles of physiology but are certainly fictions at 
the level of total behavior. Cannon's canon of homeostasis is indeed 
well illustrated by the intricate nervous regulation of man's internal 
environment and in this connection serves as a big gun, but it is hardly 
permissible to say that the adjustive behavior of the neurotic is com- 
parable to the range in pH content of the blood. Nor is it permissible, 
to the reviewer at any rate, to conclude even speculatively in favor of a 
structurally defective brain in psychopathic personality because the 
behavior of the latter is without restraint and the supposed function 
of higher nervous centers is to exercise restraint over lower centers. 

The problem of the legitimacy of infantile memories is solved to the 
author's apparent satisfaction by comparing the motor reactions of his 
subject during hypnotic recall with the Gesell norms for child develop- 
ment. Harold's crucial six to eight months' memories are accepted as 
valid, but the author does not report whether or not at this time his 
subject was putting his own foot in his mouth. 

G. Raymond Stone. 

Indiana University. 

Engle, T. L. Psychology: principles and applications. New York: 
World Book Co., 1945. Pp. ix+549. 

This textbook was written for high school students who will probably 
take no further formal courses in psychology. It is a self-contained 
presentation and omits bibliographic references to other volumes or 
experimental literature. Accompanying the text is a Teacher's Manual 
containing some of these references and other material to facilitate the 
subject presentation. The author had extensive experience as a teacher 
of high school psychology before moving to the college level. 

Throughout the text, the author prepares a factual and attitudinal 
foundation for students who will continue their exposure to psychology 
only through such “advanced ‘texts’ . . . (as) magazine articles, radio 
programs, motion pictures, sermons, and lectures; . . . (while) their 
psychological ‘laboratories’ will be the homes, the businesses, the clubs, 
and the communities in which they will function as citizens.” 

Most of the 17 chapters present subject matter of immediate interest 
and practical value to high school students. The following sample 
chapter headings illustrate the functional style which keynotes the 
entire volume: Friendship and Love; Popularity and Leadership; Im- 
proving Learning Techniques; Getting in Touch with Our Environment; 
The Senses; Unusual Personalities; and the final apt chapter heading, 
Concerning Several “Mysterious” Matters (hypnosis, dreams, etc.). 
However, there are three chapters which are less consistent with this 
central theme. Chapter I, An Introduction to the Science of Psychology, 
includes a discussion of the scientific method and a short statement of 
the special fields of psychology. This material is, by necessity, highly 
abstract and would probably be more meaningful after the student has 
become familiar with some specific problems and procedures of psycho- 
logical analysis. Chapter II, Preparing to Read Psychology, is a summary 
of basic statistical methods. The same criticism applies here particularly 
since far less use is made of statistical terms in the remaining chapters 
than is indicated by the emphasis given in this chapter. Both of these 
chapters have considerably less inherent interest to the high school 
student than the information in the rest of the book and it is unfortunate 
that this traditional, though questionable, introduction should be 
presented to students of this level. 

The second of two chapters on learning is better than the first since 
it is an application of fact and theory to the learning-in-school situation 
while the first is a digest of the basic concept and experimental tech- 
niques used in both animal and human studies. Also included in the 
first chapter on learning is a well written discussion of thinking. This 
section might profitably have been expanded into a separate chapter. 

This must have been a difficult text to write. Undoubtedly it is 
easier to list in summary form a series of related psychological studies 
than to interpret these findings in a way that will answer fundamental 
psychological questions confronting high school students. Many times 
the author must have wished that psychologists would or could find 
ways of examining the psychological attributes of more daily bread 
and butter problems. However, rather than be tempted into a series of 
student-interest lessons of psychology applied to daily life, the author 
consistently presents and interprets information drawn from scientific 
and critical source material. 

This book includes a large number of illustrations and examples 
drawn from contemporary events — particularly material resulting from 
psychological and related scientific contributions developed during the 
war. As this information becomes more widely disseminated and con- 
tinues to be released from military restrictions, the volume will acquire 
even greater practical applicability for high school students and 
teachers. 

The reviewer recommends this textbook for examination by every 
teacher of high school psychology. This book should aid in making 
available to a large number of younger students the more practical 
results of psychological research and analysis. 

Stanford C. Ericksen. 

University of Arkansas. 

Bird, Charles & Bird, Dorothy M. Learning more by effective study. 
New York: D. Appleton-Century, 1945. Pp. viii+275. 

This is another book on how to study. The fact that more than forty 
books and manuals on this subject have been published in the United 
States since 1926 is indicative of the importance of the problem treated. 
It may also signify that individuals working in this field have not been 
entirely satisfied with the productions of their colleagues and co-work- 
ers. A major weakness of many of the available texts on how to study 
is that their numerous suggestions have been based largely on opinion 
rather than upon research in the field or upon sound principles of the 
psychology of learning. 

This new book by the Birds, however, is refreshingly different in 
this respect. The procedures for effective study which they advocate 
all appear to be founded upon well validated psychological principles 
and theories which have come from the experimental literature. Instead 
of giving numerous conflicting and confusing rules for study, this book 
suggests a few highly significant methods and principles which have 
wide applicability. The authors, furthermore, appear to be concerned 
with the total personal adjustment of the student as well as with his 
academic achievements. 

The book is not only psychologically sound in its approach, but it is 
conservative and accurate in its claims as to the amount of improve- 
ment that can be expected to result from special attempts to develop 
better study habits. For example, in discussing the progress made by 
college students who devoted eight weeks to the improvement of their 
speed of reading, the authors state that “they read 16.5 per cent faster 
in the final than in the initial test.” (99) Although this finding is 
undoubtedly typical of the improvement one might expect to find in this 
particular skill during such a period of practice, it is quite unusual to 
find authors who do not make much more sweeping and extravagant 
claims. 

On the whole, the book Learning More by Effective Study represents 
a very useful contribution to our knowledge of the psychology of study. 
College students, high-school seniors, and adults who will carefully fol- 
low its suggestions can with confidence expect to make substantial im- 
provements in their study techniques. 

Glenn M. Blair. 

University of Illinois. 

Abrahamsen, David. Men, mind and power. New York: Columbia 
Univ. Press, 1945. Pp. viii+155. 

A psychiatrist and refugee from Norway holds psychoanalytic 
post-mortems on the German nation, Hitler, Goering, Goebbels, Him- 
mler, Quisling and Laval. 

The people of Germany are diagnosed somewhat in Adlerian fashion 
as chronically suffering from feelings of inferiority with pronounced 
compensatory strivings for recognition. Their humiliation from defeat 
in World War I touched such a new low that temporarily they were at 
a complete loss for a way of compensation. Into this situation stepped 
those arch-antinomians, Hitler, Goering, Goebbels and Himmler, acute 
sufferers from long-standing personal inferiority-feeling who thought 
they saw at once for themselves and the nation a way of salvation — a 
way to a thousand years of pax Germanica — in a projection of the causes 
for German misery upon the Jews, Communists and capitalistic nations 
and in a ruthless destruction or subjugation of these imaginary enemies. 
The Aryan population of Germany became as one in enthusiastic ac- 
ceptance of this highly neurotic fiction which alone gave promise of a 
restoration of status and which was to eventuate in so much world- 
havoc. In anti-German countries, particularly France and Norway, 
elements of the population like Quisling, Laval et al., long smarting 
from frustration and non-appreciation, saw likewise a certain salvation 
for themselves, an opportunity for recognition in a throwing of their 
sympathies and activities with Hitler’s new order against the interests 
of their own countrymen. 

The latter part of the book is devoted to psychotherapeutic measures 
which Abrahamsen thinks should be taken by the United Nations in 
the interest of a permanent cure for this grand status-neurosis of the 
German people. Fully cognizant of the delicacy with which treatment 
must be undertaken by outsiders if the trouble is not to be 
exacerbated, the author indicates among others two rather significant 
approaches: 1. the replacement of men teachers in the schools by 
wholesome minded women; 2. a deführerizing of the father, i.e. a 
democratization of the German family set up. 

The nuclear ideas of this book appear of interest to the general 
reader and most certainly provide food for thought for criminal and 
social psychologists. Unfortunately these ideas are somewhat obscured 
by the author’s style which might be described as semi-free association. 

F. C. Sumner. 

Howard University. 

The study of eye movements continues to be an important and much 
used technique for investigating the reading process. An overall sum- 
mary of the field up to 1935 is given by Tinker (108). The last review 
(107) to appear in this journal was published in 1936. In the present 
review, material appearing during the years of 1935 through 1944 is 
considered. Bibliographies, critical evaluations, summaries of parts of 
the field, and textbook discussions will be found in references (13, 37, 
38, 41, 74, 95, 97, 103, 116, 117). A few titles not available to the re- 
viewer have been included in the bibliography for the sake of complete- 
ness (6, 29, 30, 46, 47, 48, 49, 126). Certain material on eye movements 
will not be reviewed since it is not closely related to the reading process. 
This includes reports on eye movements in viewing pictures and adver- 
tisements, certain studies of visual fixation, eye movements as related 
to visual attention, and the like. 

The studies to be reviewed here group themselves as follows: tech- 
niques of measurement, analysis of the reading process, oculomotor 
coordination, factors in variability, special reading situations, heredity 
and certain environmental factors, eye movements in the reading clinic, 
training eye movements, and central versus peripheral processes. 

Techniques of Measurement 

Apparatus. Although some of the new apparatus for recording eye 
movements consists of modifications of earlier techniques, others 
involve new principles. Brandt (9) describes an adaptation of the Dodge 
corneal reflection type of camera. Its special feature is simultaneous 
recording of the vertical and horizontal components of eye movements. 
A portable model of the Dodge type of camera, the Ophthalm-o-graph, 
is described by Taylor (105). Records of horizontal eye movements of 
both eyes are recorded simultaneously. Tiffin and Fairbanks (106) have 
developed a camera for simultaneously recording eye movements and 
voice reactions in oral reading. Miles’ (70) note on the first 
eye-movement record photographed by Dodge has historical interest. A 
demonstration film on eye movements in reading has been prepared by 
Valentine, Troyer and Brown (118). 

An electrical method of recording eye movements is finding some 
use. Changes in the corneo-retinal potentials are closely proportional 
to the sine of the angle of rotation of the eye. With proper placement 
of electrodes and with an amplifying system, the movements may be 
recorded mechanically or photographically. Miles (71) and Halstead 
(40) suggest that eye movements in a variety of situations may be 
measured by this technique. Its validity for recording eye movements 
in reading has been checked by Hoffman, Wellman and Carmichael (43).
Electrical and corneal reflection records were made simultaneously while 
their subjects read and while they made horizontal and vertical eye 
movements in viewing a dot pattern. The only data given for the
reading responses was a sample record. Although the two techniques 
yielded similar records, they were far from identical. Where extent of 
eye movements and duration of fixation pauses for viewing the dot pat- 
terns were measured, the agreement as shown by correlations is close 
but again the records do not yield identical results. Although this elec- 
trical technique has sufficient validity to justify its use in certain situa- 
tions where it is impossible or inconvenient to photograph the corneal 
reflection, it is clear to the reviewer that it is not satisfactory in its 
present form for studying eye movements in reading. 
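
The sine relation stated above can be put in a minimal sketch. The single-point calibration used here to fix the constant of proportionality, and the function and parameter names, are assumptions made for illustration; they are not taken from the studies reviewed.

```python
import math

def rotation_angle_deg(potential, calib_potential, calib_angle_deg):
    """Estimate eye rotation (degrees) from a corneo-retinal potential change.

    Assumes potential = k * sin(angle), with k fixed from one calibration
    reading (a hypothetical procedure, not the reviewed authors' method).
    """
    k = calib_potential / math.sin(math.radians(calib_angle_deg))
    # Clamp so that rounding error cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, potential / k))
    return math.degrees(math.asin(ratio))
```

Because the relation is to the sine and not to the angle itself, a potential half as large as the calibration reading corresponds to slightly less than half the calibration angle.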

Another electrical technique, recording photographically (electro- 
myograms) the action currents generated by movement of extrinsic eye 
muscles during reading, has been used by Luckiesh (61) and Luckiesh 
and Moss (62, 63, 64, 65). The sample record given by Luckiesh (61, 
p. 128) is either not typical or the author has read the time intervals 
wrongly. He gives 0.120 sec. for time of back sweep when the typical 
value is approximately 0.050 sec. The duration of his interfixation 
movement also appears unduly long. Furthermore, the demarkation be- 
tween successive fixation pauses is not always clear. A direct comparison 
of this technique with the corneal reflection method is needed to check 
the validity of the former. The corneal reflection technique of photo- 
graphically recording eye movements, therefore, remains the most satis- 
factory method for use in reading investigations. 

Reliability. The reliability or consistency of eye-movement records
has been considered by several workers. Imus, Rothney and Bear (44),
using fifty-word samples of text, found that eye-movement measures
had low reliability (r = .59 to .72). In another report Imus, Rothney
and Bear (45) re-emphasize the unreliability of eye-movement records.
Employing the same material, i.e., that furnished with the Oph-
thalm-o-graph, Anderson (3) and Broom (12) note similar trends in
low reliability. In a more extensive study Tinker (112) found that re- 
liability of eye-movement measures was low only when short samples of 
text (5 to 15 lines) were employed. When 20 to 40 lines were read the 
reliabilities were about .80. Thus eye-movement records will have 
adequate reliability only when a satisfactorily long sample of reading is 
used. When only group comparisons are involved, short samples are 
adequate. But in the clinic, where individual diagnosis occurs, records 
from reading at least 20 lines of text are indicated. An even longer 
sample is desirable. 

Validity. If eye-movement measures are to have meaning for valid- 
ity evaluation there must be adequate comprehension of the material 
read. Users of the Ophthalm-o-graph for photographing eye movements 
have commonly employed short samples of reading material with ques- 
tions on comprehension furnished with the apparatus. Imus, Rothney 
and Bear (45) have found this check on comprehension to be useless. 
Authors such as Broom (12), Anderson (1, 3) and Imus, Rothney and 
Bear (44, 45), have reported very low validity coefficients for eye- 
movement measures of reading when correlating eye-movement meas- 
ures with scores on reading tests. Such a comparison constitutes an un- 
satisfactory test of validity since it is well known that reading scores 
derived from different reading situations are not related or yield only 
low correlations. Tinker (112) has shown that when the material read 
before the camera is strictly comparable to that in the performance 
tests the validity of fixation frequency and perception time (fixation 
frequency times pause duration) was very high (r = -.80 to -.99).
Pause duration by itself was found not to be a valid measure of reading 
performance. 
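
The definition of perception time given above (fixation frequency times pause duration) may be illustrated by a short sketch. It assumes that pause duration means the mean duration of the recorded pauses, so that the product is simply the total time spent in fixation; the function name and the sample values are hypothetical.

```python
def perception_time(pause_durations):
    """Return (fixation_frequency, mean_pause, perception_time) for one record.

    pause_durations: durations in seconds of each fixation pause.
    Perception time is taken as frequency times mean pause duration,
    which equals the summed pause time (assumed reading of the definition).
    """
    frequency = len(pause_durations)
    if frequency == 0:
        raise ValueError("at least one fixation pause is required")
    mean_pause = sum(pause_durations) / frequency
    return frequency, mean_pause, frequency * mean_pause
```

On this reading, two readers with equal mean pause durations may differ widely in perception time if one makes more fixations, which is consistent with pause duration alone being a poor measure of performance.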

Other evaluation. Is a reader able to give a typical reading per- 
formance in the apparently artificial laboratory situation where the 
photographing of eye movements takes place? Tinker (112) has checked 
this and found that the performance was the same before the camera 
as when working at a table. This finding has been confirmed by Gilbert
and Gilbert (34). It should be noted, however, that it is important to 
adapt a reader to the laboratory situation when photographing eye 
movements. 

Stromberg (101) has questioned the common practice of plotting the 
fixation points of the eyes to show the location of these fixations on the 
material read. This has ordinarily been done for a single eye, assuming 
that the two eyes coordinate in fixating on the text. When Stromberg
plotted the fixations for both eyes he found what appear to be startling
discrepancies in binocular fixation. That is, the fixation of one eye fre- 
quently did not coincide with that of the other eye. Undoubtedly there 
has been unwarranted optimism concerning the usefulness of plotted 
fixations to show exactly where the eyes paused. Since one fixates an 
area rather than a point, and because movement of the reflected beam 
does not correspond exactly to the visual angle of movement, plotted 
fixations can only represent approximations to the correct place of fixa- 
tion. One suspects that part of the binocular fixation discrepancy found 
by Stromberg was due to the reflected spot of light not being in cor- 
responding locations on the two eyeballs, a condition which would pre- 
vent correspondence on the photographic record. Nevertheless, this 
study does raise an important question concerning interpretation of 
plotted fixations. 

In general the evidence indicates that under adequate experimental 
conditions, eye-movement records are reliable and valid measures of 
reading performance. 

Analysis of the Reading Process 

The fixational pause. Perception occurs during fixational pauses.
Arnold and Tinker (5) made an extensive analysis of the fixational pause 
of the eyes. On the average a pause of .157 sec. was required to identify 
a letter and .172 sec. to accurately fixate a dot. These are longer than 
the .100 sec. required for a well cleared-up perception. In a variety of 
reading situations the mean pause duration varied from .217 to .404 sec. 
These are much longer than the pauses made in simple fixations re- 
quiring a minimum of perception. The longer durations in reading may 
be explained in terms of the requirements for comprehension and as-
similation of the materials read. Luckiesh and Moss' (65) statement
that reading pauses average about .150 sec. is a misrepresentation. 
Reading with comprehension cannot have a mean pause duration as 
short as this. 

An excellent analysis of regressive pauses, i.e., pauses following back- 
ward eye movements within a line of print, in the reading of 9th and 
10th grade children is given by Bayle (7). She found that regressive 
pauses tended to fall into six patterns: adjustment after the first fixa- 
tion in a line, adjustment within a line when the span of vision is over- 
reached in a too long forward move, regressions for verification, re- 
gressions during word analysis, regressions for phrase analysis, and 
regressions for re-examination of a whole line. It was found that regres-
sions were caused by need to complete perception and by difficulties of
interpretation due to failure to recognize the meaning of a word or
failure to connect meaning of a word with the context. Bayle points 
out that regressive pauses are necessary parts of the reading process 
in analytical reading. Buswell (14) gives a cautious approval of this 
view but many writers ignore it. Recognition of the fact that regressions 
are legitimate and essential in certain kinds of reading would avoid 
some of the misconceptions involved in training eye movements (see
p. 110).

The fixation span. This is the amount of material read in one fixation. 
The larger the span, the fewer the fixations and consequently the faster 
the reading. By using reading material with spaces between phrases 
and encouraging readers to fixate each phrase only once, Robinson (87)
obtained the relation between perceptual span for tachistoscopically
exposed material and fixation span in reading. For tachistoscopic ma-
terial not in context this correlation was .52, for material in context
(using a special tachistoscope) the correlation was .66. He concluded
correctly that span derived from short exposures predicts eye-move-
ment behavior inadequately. La Grone (55) and Knehr (51) also found
a low correlation between perceptual span and fixation span. Dearborn 
and Anderson (27) have perfected a highly flexible and effective motion 
picture technique for exposing successive phrases in context to increase 
the fixation span. 

In a study of adult reading, Buswell (14) emphasizes that fixation 
span or span of recognition is an important index of reading maturity, 
i.e., a narrow span indicates immature reading habits. He found no rela- 
tion between visual efficiency and span. There is a similar emphasis 
upon fixation span in the author’s second monograph (15). In all of
Buswell’s work (14, 15) there is a sound emphasis upon training to in- 
crease the span of recognition which directs attention to perception 
rather than to the motor aspects of eye movements. 

Luckiesh and Moss (62) have studied the influence of typographical 
factors on extent of the fixation span. They found that span decreased 
slightly in going from four to 10 point type and that it increased slightly 
in going from 13 to 21 and 29 pica line widths. Their interpretations
reveal a decided lack of understanding of the reading process. No ac- 
count is taken of possible accompanying changes in either pause dura- 
tion or efficiency of peripheral vision. 

Oculomotor Coordination 

Rhythm. Much has been made of rhythm in eye movements. Buswell 
(14) states that “a mature reader covers a line of print with regular,
rhythmic eye movements” but admits that reading habits should be
flexible and that in some situations regressions are desirable. Taylor
(105) has devised techniques for developing “rhythmical, left-to-right
eye-movements.” In the non-experimental literature the importance of
rhythmical eye movements is stressed, i.e., regular forward movements
along a line of print and about the same number of fixations from line
to line are considered highly desirable. This emphasis is derived from
the notion that the efficient reader is efficient because he is able to estab- 
lish rhythmical eye-movement habits more readily and that favorable 
typography fosters rhythmical movements. Sisson (94) has made a 
quantitative analysis of motor habits of eye movements in reading as 
related to length of the printed line, kind of material read, and level of 
reading ability. Tendency toward rhythmical motor habits was evalu- 
ated by means of a habit index, i.e., the percentage of successive lines 
having the same or nearly the same number of fixations. The habit 
index was higher for reading the shorter lines of print and slightly higher 
for the more efficient readers. This apparent relation is accounted for in 
terms of probability, however, rather than habit, for the fewer the
fixations, the better the chance of two adjacent lines having the same 
number of fixations. The habit index favored neither the reading of easy 
narrative nor scientific prose. Another alleged sign of rhythmic eye 
movements, or short-lined motor habits as they are sometimes called, is
a characteristic temporal distribution of pause durations which is al- 
leged to be fostered by shorter lines. Although some evidence for such 
a pattern was found, it was no different in short than in long lines of 
print. In view of his results, Sisson suggests that the concept of short-
lined motor habits is useless. The reviewer would add that this notion of
rhythmic reading is not only a useless concept but a harmful one. For
years the view that rhythmical eye-movement behavior is character-
istic of effective reading and is highly desirable has directed attention
toward eye movements. This has led to an undue emphasis upon 
peripheral oculomotor mechanics to the sacrifice of adequate attention 
to the more important central factors of comprehension and assimila- 
tion (see training of eye movements p. 110). The findings of Walker (120) 
are in line with the above conclusions. He found “no evidence of a habit 
of moving the eyes any fixed distance or in any rhythm” in his analysis
of eye movements of good readers. If rhythm is absent in the eye 
movements of highly skilled readers, as indicated in these studies, the 
view that rhythmical eye-movement patterns are desirable for effective 
reading becomes meaningless. Weaver (123) found a high degree of
variability in ocular behavior characteristic of both memorizing and
recall. 
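
The habit index described in this paragraph (the percentage of successive lines having the same or nearly the same number of fixations) can be sketched as follows. The tolerance that defines "nearly the same" is an assumption made for illustration, since the criterion Sisson actually used is not stated here; the function name is likewise hypothetical.

```python
def habit_index(fixations_per_line, tolerance=1):
    """Percentage of successive line pairs with (nearly) equal fixation counts.

    fixations_per_line: number of fixations on each consecutive line read.
    tolerance: largest difference still counted as "nearly the same"
               (assumed value; not taken from the study reviewed).
    """
    pairs = list(zip(fixations_per_line, fixations_per_line[1:]))
    if not pairs:
        raise ValueError("at least two lines are required")
    matches = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return 100.0 * matches / len(pairs)
```

For the counts 5, 5, 6, 8, 8 the index is 50 per cent when exact agreement is required and 75 per cent when a difference of one fixation is allowed. An index of this kind rises whenever the counts can take only a few values, as on short lines, which is the probability account given above.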

Motor efficiency. Tinker (111) has ascertained the relation of oculo- 
motor efficiency to reading proficiency. There were no significant cor- 
relations between reading proficiency and either accuracy of visual 
fixation measured in terms of number and extent of corrective moves at 
fixation point or speed of convergence-divergence moves. Only when 
extremes of the group were compared was there a slight relationship 
evident. These findings refute previously held views. Similarly, Strom-
berg (99) found no differences between fast and slow readers in con-
vergence and divergence movements during reading. Knehr (51) noted
that vergence movements in reading monocularly were the same as for 
binocular reading. 

Eye-movement time. The proportion of time taken by interfixation 
movements in reading is relatively small and varies with kind of ma- 
terial read. Tinker (109) found that the per cent of reading time de- 
voted to moves ranged from 5.3 to 9.6 with an average of 7.3. The more 
careful and analytical the reading, the smaller the relative time taken 
by moves. Luckiesh and Moss (63), after noting that there is no clear 
vision during horizontal eye movements in reading, state, “However,
this unique and advantageous arrangement is stated to be absent in the 
case of vertical movements of the eyes.” There is no sound basis for the 
latter statement. 

Monocular and binocular vision. Apparently oculomotor behavior is 
about the same whether one or both eyes are used in reading. Knehr 
(51), employing eye-movement measures, found no significant differ- 
ences in performance between monocular and binocular reading. Fur- 
thermore, the oculomotor patterns were alike under the two conditions. 
Clark (23) found that alternating vision which occurs during bar- 
reading, did not appreciably alter reading efficiency as measured by eye- 
movement records. Further experimenting with a larger group of sub- 
jects and with a more adequate sample of reading is needed to check 
these results. Nevertheless there is evidence to support the above find- 
ings. A mass of data indicates that central factors of comprehension 
and assimilation rather than peripheral, i.e., mechanical factors, are 
dominant in determining reading proficiency. 

Factors in Variability 

Variation in subject matter. For some time it has been recognized 
that variation in subject matter produces changes in eye movements.
Additional information on the nature of the variation is contained in the
following studies. Seibert (89) made an analytical study of eye move- 
ments of 8th grade pupils while reading mathematics, biography, ad- 
venture, physical science, history and geography. Differences in all eye- 
movement measures occurred in going from one subject matter to 
another, but these differences are not as significant as the author appears 
to believe. The critical ratios ranged from .00 to 2.53. It is admitted 
that some children maintained a definite pattern of eye movements 
throughout all materials. This indicates a lack of flexibility in reading 
habits. A similar analytical investigation was completed by Stone (98) 
with college freshmen as subjects. They read arithmetic, biology, Eng- 
lish, educational psychology, physical science and social science. Al- 
though the critical ratios are not large, differences that tended toward 
significance were obtained for eye-movement measures in going from 
one kind of material to another. Implications of the findings were not 
discussed. Both of these studies may be criticised for faulty statistical 
interpretations. 
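
For readers unfamiliar with the statistic, a critical ratio is the difference between two means divided by the standard error of that difference. The sketch below assumes independent groups; where the same pupils read every kind of material, the standard error would also need a term for the correlation between the two sets of measures. By the convention common at the time, a ratio of about 3 or more was usually required before a difference was called significant, and that figure is quoted here as background, not as a criterion stated by the authors reviewed.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes the two samples are independent (an assumption of this
    sketch; correlated measures need an extra covariance term).
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / se_diff
```

Judged in this way, ratios ranging from .00 to 2.53 fall short of the usual criterion, which is the ground for the criticism made above.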

Condition of reader. This category is concerned with such factors as 
maturity level, reader proficiency, stuttering and deafness. Buswell 
(14) has investigated the eye movements of adult readers. Measures 
employed for diagnosing reading maturity include: 

1. The span of recognition or amount of print recognized in a single pause 
depends directly upon fixation frequency. A relatively wide span indicates 
greater maturity. 

2. Rhythmic regularity of progress of eye movements along the line of 
print is considered symptomatic of a high degree of maturity in reading. 

3. For any given kind of reading the more mature reader will make fewer 
regressive movements than the immature reader. 

4. The more mature reader employs eye-movement pauses of briefer dura- 
tion. 

On all these measures the adult readers tended to be less mature 
than high school seniors. In fact the adults were only slightly better 
than 6th grade children in reading material within the comprehension 
range of both children and adults. Buswell emphasizes that reading 
habits should be flexible, i.e., they should adjust themselves to the kind 
of material read. Nevertheless, he failed to note that in general his adult 
readers revealed a marked lack of flexibility of oculomotor behavior in 
going from easy to more difficult material. 

Anderson (2) compared the oculomotor patterns of 50 good and 50
poor readers to find how such readers adjust to changes in the difficulty
of the reading material, and to reading for different purposes. All eye-
movement measures distinguished good from poor readers. For both
groups, fixation frequency, regression frequency and pause duration in-
creased with the more difficult reading material. Good readers showed a
greater flexibility in eye movements in adjusting to increasingly difficult 
material. Greatest irregularity in oculomotor patterns occurred in 
reading to remember details. Here the difference between groups was
present but less pronounced. It was concluded that two important 
determinants of regularity in eye movements are difficulty of material 
read and purpose of reading. In a qualitative analysis of eye movements 
of good readers, Walker (120) discovered that on the average fixations 
were distributed evenly throughout the line for both easy and difficult 
material. Pause duration was greater in the first part of lines. About
80 per cent of lines were covered by fixations and the last fixation was 
nearer the end of the line than the first. Extent of forward shift of eyes 
was not related to duration of fixation at end of the shift. 

There is not complete agreement concerning the oculomotor pat- 
terns of stutterers. While Moser (73) found that eye movements in 
silent reading are less effective for stutterers than for normal children, 
Hamilton (42) found no differences. The former considers that stutter- 
ing may cause reading disability while the latter states that stutterers 
do not need special treatment in the silent reading program. Since 
Hamilton’s experiment was more adequately controlled, the more ac- 
ceptable view is that eye movements in silent reading are the same for 
both stutterers and normal readers. In Moser’s (73) study, silent fixa- 
tion was the same for stutterers and normal speakers, but when stut- 
terers talked during fixation, a variety of eye movements occurred. 
During oral reading, differences in oculomotor patterns were associated 
with stuttering spasms. These findings suggest that extrinsic eye 
muscles as well as other muscle groups are not under control during 
stuttering. Working with another kind of handicapped group, La Grone 
(56) found that the eye-movement patterns of deaf children differed 
from those of hearing children. This was especially true for develop- 
mental levels in reading. Although pause duration tended to decrease 
from grade to grade, fixation and regression frequency increased and 
then decreased. The most common kind of regression occurred at the 
beginning of lines, but there were fewer regressions for deaf than for 
hearing children. Since the groups studied were very small, these find- 
ings need checking. 

Oral versus silent reading. A few studies have been concerned with
eye movements in oral reading. Buswell (14) points out that schools 
have recognized that the processes of oral reading and of silent reading 
are psychologically different. Experimental work yields evidence on the
degree of this difference. The eye movements of good, poor and un-
selected college subjects during silent reading were compared with eye
movements during oral reading by Anderson and Swanson (4). The 
correlations for the eye-movement measures ranged from .50 to .70. 
Coefficients for the poor readers were somewhat higher. In silent read- 
ing there were fewer fixations and regressions and shorter pause dura- 
tions. Going from poor to good readers, the differences in eye move- 
ments between oral and silent reading increased. Thus the eye-move- 
ment patterns tended to be more similar in the two kinds of reading 
for poor readers. These data show, therefore, that there are common 
elements in oral and silent reading, especially among poor readers.
According to Buswell (14) silent reading of adults differs from oral read- 
ing in respect to the degree that vocalization is suppressed. Evidence 
from eye-movement measures while silently reading material designed 
to exaggerate vocalization supports this view. Under these conditions 
the eye movements of poor readers were like movements in oral reading; 
those of good readers were not. 

In studying the eye movements of adults during oral reading, Bus- 
well (14) found that the mean eye-voice span is an excellent index of 
the maturity of a person's oral reading habits. The greater the span 
the more mature the reading. An excellent study of eye movements of 
good and poor readers during oral reading is reported by Fairbanks (31). 
Both eye and voice records were made. As in the above study the eye- 
voice span reflected maturity of reading habits. Although difficulty of 
textual material determined variations in eye-movement measures, the 
good readers were less affected than the poor readers. No fixed pattern 
of eye movements was discovered. Analysis of regressions in relation to 
eye-voice span together with a comparison of the eye-movement pat-
terns of good and poor readers led to the conclusion that faulty eye
movements do not cause errors in reading but that the errors are central 
in origin. 

Miscellaneous. Certain studies are not easily classified under the
above groupings. Simpson (92) found no significant correlation between 
eye-movement measures and time spent in leisure reading. It is con- 
cluded without justification that eye movements are not improved un- 
less specifically trained. In another study, Simpson (91) obtained cor- 
relations of —.38 to —.52 between fixation and regression frequency 
versus college achievement and mental ability. Pause duration showed 
no appreciable correlation with achievement and intelligence. Reading 
test scores correlated higher than eye-movement measures with achieve-
ment and intelligence. The inference that eye-movement training is
desirable because the correlations are low is a misinterpretation of the 
data (see training eye movements p. 110). Imus, Rothney, and Bear (45) 
found that the correlations for eye-movement measures versus college 
aptitude, scholastic achievement and reading test scores were only .07
to .38. Their conclusion that eye-movement measures are unreliable and 
invalid is not justified (see reliability p. 94 and validity p. 95). In a similar 
study with children, Leavell and Sterling (58) found low to moderate 
correlations between eye-movement measures and intelligence. Taylor 
(105) obtained significant differences between eye-movement measures 
of children making normal school progress and of those failing. The 
grade norms given by Taylor are of doubtful value because an unre- 
liable sample of eye movements was obtained. In fact Taylor was largely 
responsible for providing the inadequate samples of reading material 
employed by subsequent users of the Ophthalm-o-graph. They are 
too short to yield reliable records. 

Peripheral vision. For years writers have been emphasizing the im- 
portance of peripheral vision in reading. Cues perceived in peripheral 
vision not only furnish premonitions of coming words and phrases, but 
they also guide the movements of the eyes to new fixations along the 
lines of print. La Grone (55) has studied the relationship between 
peripheral vision and eye movements in reading. Accuracy of perception 
in the left and in the right peripheral fields was obtained from tachisto- 
scopic exposure. High perception scores in the left peripheral field 
tended to go with fast reading, few fixations and shorter pause duration. 
But high perception scores in the right peripheral field tended to go with 
slow reading, more fixations and longer pause duration. Regression 
frequency was not significantly related to perception scores. Higher 
perception scores either in right or left peripheral fields are said to be 
due to preference. Since vision in the right peripheral field is supposed 
to be important in reading, why do the more able readers have better 
perception to the left? The author misses the point that in tachisto- 
scopically exposed material the long practiced habit of reading material 
from left to right is dominant. It is logical to infer that such a habit is 
more effective for good than for poor readers. Furthermore, a short 
exposure test of peripheral vision is not an adequate measure of periph- 
eral vision as it operates in the normal reading situation where the 
successive fields of vision overlap. While the findings in this study are 
interesting, they do not provide a crucial test of the role played by 
peripheral vision in normal reading.

Special Reading Situations

Spelling. In a follow-up of an earlier study on spelling (see 107), 
Gilbert (33) obtained records for six children over a three-year period. 
The eye movements revealed growth characteristics in perceptual 
habits. Although there was little change in pause duration, fixation and 
regression frequency decreased and there was evidence of a more effec- 
tive method of attack in studying the words. The more mature spellers 
differentiated between hard and easy sections of words and their ocular 
responses tended toward simple verification reactions. Gilbert and 
Gilbert (35) recorded eye movements before and after training for 
speed and accuracy of perception in learning to spell with 4th, 5th, and
6th grade pupils. Training resulted in less perception time per word 
and fewer fixations and regressions, but no change in pause duration. 
Thus the training resulted in a larger fixation span and less attention to 
details, but the general pattern of cross study and accuracy of spelling 
was maintained. Here is a good example of how training in perception 
habits with no attention to eye movements yields changed oculomotor 
patterns which reflect more effective central processes of apprehension 
and assimilation. In a later study, Gilbert and Gilbert (36) found that 
spelling is learned more effectively when words are studied in isolation 
than when encountered as critical words in reading. Eye movements 
revealed that when words were in text the reading was disrupted, espe-
cially for poor spellers. The authors’ suggestion that a concentration of
fixations may represent spelling reactions rather than comprehension or 
recognition difficulties would hold only if the readers were set to learn 
spelling as they were in this experiment due to a prior spelling test. 

Reading music. The reading of music is a highly specialized oculo- 
motor task which involves both vertical and horizontal eye move- 
ments. Lowery (60) notes that individual differences are in terms of 
pause duration rather than fixation frequency. Weaver (124) obtained 
photographic records of eye movements and of keyboard performance 
for 15 trained musicians during sight reading and playing of three selec-
tions at a piano. From one to two notes were read per fixation. A
musical note is considered equal to a word in reading. Pause durations 
are longer in music reading than in word reading and vary with the 
complexity of note relations. The author’s correlations of .82 to .89 
between pause duration and reading time and of only .27 to .58 between 
fixation frequency and reading time are in contrast to word reading 
where fixation frequency yields the high correlations. This supports 
Lowery’s (60) finding that pause durations provide the individual varia-
tion. According to Weaver (124), predominance of horizontal or of
vertical progression of eye movements depends upon the nature of the
music. Notes on the treble staff are usually but not always read ahead 
of notes on the bass staff. The eye-hand span was highly variable. 
Regressions appeared to be related to kind of music read. In another 
study, Van Nuys and Weaver (119) investigated eye movements of 12 
experienced musicians while reading musical rhythms and melodies of 
12-note phrases, each presented for a period of 2.8 seconds. Memory 
span equalled the number of notes correctly played for each phrase 
exposed. Regressions occurred more for the melodies than for rhythms 
while pause duration increased with increase in complexity of note rela-
tions for both melodies and rhythms. Pause duration tended to be longer
for rhythms than for melodies and there were more notes correctly 
played per fixation for the rhythms. Findings in these studies of eye 
movements in reading music furnish material for a comparison with the 
reading of words: 

1. Considering a note comparable to a word, the fixation span is similar in 
both kinds of reading.

2. Regressions serve somewhat the same purpose in both situations. 

3. Increase in complexity of text brings increase in complexity of eye- 
movement patterns for both music and words. 

4. Reading music involves both vertical and horizontal eye movements 
while reading words in text involves only the latter. There are of course minor 
vertical deviations of the eye in reading a line of print. And the return sweep
to the next line, involving both vertical and horizontal displacement, is common
to both situations.

5. In reading words, fixation frequency is the best indicator of reading rate 
but in music pause duration is best. 

Hereditary and Certain Environmental Factors 

Twin studies. Two studies of twin similarities in eye-movement pat- 
terns while reading prose have been reported. Morgan (72), comparing 
eye-movement measures for members of pairs in a well controlled experi- 
ment, found very low correlations (.04 to .24) between artificial pairs, 
higher (.24 to .53) for fraternal twins, and still higher (.61 to .72) for 
identical twins. These coefficients indicate more fundamental bases for 
individual differences in eye movements in reading than is furnished 
by a “habit” concept. The “underlying processes of assimilation” constitute
a more logical explanation of the results. Hereditary factors play
a significant role to the extent shown by the correlations. It is note- 
worthy that pause duration, which is least susceptible to training, 
showed the highest correlations for twins. The author correctly infers 
that, in treating poor reading, consideration should be given to capabilities
of the individual rather than exclusively to certain ideal motor
sequences characteristic of oculomotor patterns of efficient readers. In
a second study, Jones and Morgan (50) report additional evidence on
twin similarities in eye-movement patterns. Median correlations for 
eye-movement measures in reading revealed a trend similar to that in 
the earlier study: artificial pairs, r = .105; fraternal twins, r = .435; and
identical twins, r = .530. Three judges had considerable success in identifying,
on the basis of general appearance, the eye-movement photographs
of identical twin pairs when they were mixed with photographs from 
fraternal twins. Considering the many factors which may modify 
oculomotor patterns, it is significant that this identification was better 
than chance. These results, therefore, support the earlier conclusion 
that eye-movement patterns in reading are not determined wholly by 
training. Genetic factors are involved to an appreciable degree. 
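
The pair correlations reported in these twin studies can be illustrated with a short computation. The sketch below is only an illustration: it assumes a plain Pearson product-moment coefficient, which the reports summarized here do not specify, and the pause durations are hypothetical values invented for the example, not data from Morgan or from Jones and Morgan.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical mean pause durations (ms) for five twin pairs:
# one value per pair for the first twin, one for the second.
twin_a = [220, 250, 270, 300, 240]
twin_b = [225, 245, 280, 290, 250]

r = pearson_r(twin_a, twin_b)
print(f"r = {r:.3f}")
```

A coefficient near 1 means that the members of a pair resemble each other closely on the measure; the argument in the text rests on the ordering of such coefficients across artificial pairs, fraternal twins, and identical twins.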

Emotion. Emotional disturbances appear to modify eye-movement
patterns. Strongin, Bull, and Korchin (102) report that when readers
expected a shock, visual efficiency in terms of eye movements decreased
by about 10 per cent and binocular coordination became worse in 14 
per cent of readers. Warren and Jones (122) obtained eye-movement 
records while subjects read in the laboratory and while they were 
exposed in a high place. No significant differences in fixation and regres- 
sion frequency or pause duration were found. But unsteadiness of fixa- 
tions appeared when in the high place, especially for acrophobic subjects. 

Illumination. Luckiesh (61) and Luckiesh and Moss (64, 65) report 
the effect of variation in illumination on eye movements in reading. 
Records were obtained at the beginning and at the end of one hour of 
reading under one foot-candle and again under 100 foot-candles of 
light. Speed of reading was slowed significantly more under
one foot-candle of illumination than under 100 foot-candles.
The difference was due largely to changes in pause duration. Although 
the authors condemn speed of reading as a measure of readability of 
print, they employ eye movements for this purpose, apparently unaware 
that eye-movement measures are speed of reading measures. 
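
The reviewer's point that eye-movement measures are themselves measures of reading speed can be made concrete. The sketch below assumes, as a simplification, that the time spent on a passage is the number of fixations multiplied by the mean pause duration, plus a small allowance for the movements between fixations. The function name and all figures are hypothetical and are not taken from Luckiesh and Moss.

```python
def reading_time_ms(n_fixations, mean_pause_ms, mean_move_ms=0.0):
    """Approximate reading time as perception time plus eye-movement time.

    Assumes one movement per fixation, which is a simplification.
    """
    if n_fixations < 0 or mean_pause_ms < 0 or mean_move_ms < 0:
        raise ValueError("arguments must be non-negative")
    perception_time = n_fixations * mean_pause_ms
    movement_time = n_fixations * mean_move_ms
    return perception_time + movement_time


# Hypothetical records: same number of fixations, longer pauses in dim light.
bright = reading_time_ms(n_fixations=100, mean_pause_ms=250, mean_move_ms=20)
dim = reading_time_ms(n_fixations=100, mean_pause_ms=280, mean_move_ms=20)
print(bright, dim)
```

Under this assumption a longer pause duration, with fixation frequency unchanged, lengthens reading time directly, so a difference in pause duration cannot be separated from a difference in speed of reading.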

Readability. Eye movements have been used extensively by others 
to determine legibility or readability of print. Bell (8) found no dif- 
ference in eye movements for reading typewriting versus manuscript 
writing, but a significant difference in favor of typewriting versus 
cursive, and in favor of manuscript versus cursive writing. Differences 
were mainly due to changes in fixation frequency rather than pause 
duration. Tinker and Paterson have completed eight eye-movement
studies on the readability of type. In a long series of studies (78) these
authors had compared speed of reading for various typographical variations.
Significant differences were found in certain comparisons. The
analytical eye-movement studies were undertaken to discover the spe- 
cific patterns of eye movements related to the disclosed differences. The 
non-optimal text, which was read more slowly than the more readable material,
was characterized by the following changes in eye-movement
patterns: 

1. All capitals (slower) versus lower case — more fixations (113). 

2. Old English (slower) versus a modern type face — more fixations (114). 

3. Ten point type: 19 versus nine pica (slower) line widths — more fixations 
and longer pause duration; 19 versus 43 pica (slower) line widths — more fixa- 
tions and regressions plus longer pause duration (79). 

4. Six point type: 13 versus five pica (slower) line widths — more fixations, 
fewer regressions and longer pause duration; 13 versus 36 pica (slower) line 
widths — more regressions and longer pause duration (81). 

5. Type size: 10 versus six point (slower) — more fixations and longer pause 
duration; 10 versus 14 point (slower) — more fixations and longer pause dura- 
tion (80). 

6. Type sizes in optimal line widths: 11 versus eight point (slower) — more 
fixations and longer pause duration; 11 versus six point (slower) — longer pause 
duration (82). 

7. Black print on white background versus red print on dark green background
(slower) — more fixations and regressions plus longer pause duration
(115).

8. Several variations: optimal versus non-optimal (slower) typography — 
more fixations and regressions plus longer pause duration (83). 

These analytical studies of eye movements in relation to typographical
variations indicate that non-optimal conditions may increase fixations
alone, pause duration alone, or both. Regressions may
increase or decrease. Factors operating to change the oculomotor pat- 
terns include changes in visibility (small print, low contrast between 
print and background), in word form (large print, all capitals), preven- 
tion of optimal use of peripheral vision (very long and very short lines), 
or lack of familiarity with a type face (Old English). 

Visual fatigue. Eye movements, along with other measures, have 
been employed in the search for adequate measures of visual fatigue.
Dearborn (26) presents some tentative data which suggest that eye
movements may reveal visual fatigue. After a long day of visual work 
consisting of proofreading and study, speed of reading was reduced and 
saccadic movements through relatively long angles were retarded. Since 
only two readers served as subjects, these trends need checking. According
to Clark and Warren (25), during a 65-hour vigil there was no
consistent trend in variation of fixations, regressions or pause duration
for eye movements in reading. Although sporadic and temporary
changes in binocular behavior occurred for certain individuals, there 
were no uniform trends as the period of wakefulness increased. Since 
compensation is possible, eye movements apparently failed to furnish 
an adequate picture of fatigue effects. Kurtz (54), however, claims 
that eye-movement records reveal fatigue after 30 minutes of severe 
visual exertion. It is difficult to evaluate his data as presented. The 
most thorough evaluation of eye movements in reading as measures of 
visual fatigue has been made by McFarland, Holway and Hurvich (69). 
Thirty minutes of either rapid saccadic exercise or rapid
convergence-divergence movements failed to produce any significant
changes in eye movements while reading. Although their results were 
negative, the authors consider that eye movements ought to provide a 
sensitive indicator of visual fatigue if a more sensitive recorder were
available. Such an apparatus would yield a more precise measure of 
fixation and of binocular coordination. But fixation and regression fre- 
quency and pause duration in reading would remain unchanged. 

In general these studies reveal that eye movements in reading are 
insensitive indicators of visual fatigue. Only when the fatiguing task is 
extremely severe is there a suggestion that eye movements are modified. 
Two factors are probably preventing fatigue from affecting eye move- 
ments in reading: (a) All experimenters have used only 10 to 12 lines 
of text to be read before the camera. The visual mechanism can com- 
pensate enough to give normal performance for the short periods of 
time required to read such materials. (b) Furthermore, there is no
adequate way to evaluate precisely the degree of comprehension in the 
short paragraphs of text ordinarily used. The reader, since there is a 
tendency to maintain one’s oculomotor pattern, may read two para- 
graphs with similar eye movements but with different degrees of com- 
prehension. An adequate check of eye movements as measures of fatigue 
can be obtained only with long samples of reading text equated for 
difficulty and so constructed that comprehension is satisfactorily meas- 
ured or maintained constant. 

Anoxemia. Definite and consistent modification of eye movements 
in reading occurs with anoxemia. McFarland, Knehr, and Berens (67, 
68) found significant increases in reading time, i.e., perception time plus 
eye-movement time, and fixation frequency in oxygen concentrations 
corresponding to 15,000 and 18,000 feet. Of greater importance, per- 
haps, was the qualitative variation of ocular movements in oxygen want. 
That is, nystagmoid tendencies and general unsteadiness appeared and 
there was an accentuation of abnormalities such as muscle imbalance. 
Furthermore, latent oculomotor defects frequently became apparent,
indicating that the procedure may have clinical significance. In general,
altered oculomotor behavior yielded a sensitive measure of the early 
effects of anoxemia. 

Eye Movements in the Reading Clinic 

Views concerning the usefulness of eye-movement records in the 
reading clinic vary. Kurtz (53) is probably overoptimistic when he
claims that the importance of a permanent objective record, i.e., eye 
movements, can not be over-estimated. The view of Clark (21) that 
eye-movement photography serves as a valuable objective auxiliary to 
other techniques is more moderate. Opposed to these is the view of
Imus, Rothney and Bear (44, 45) who, in appraising eye-movement
photography as a clinical method, are convinced that the technique lacks 
both reliability and validity. (See criticism of this view under reliability 
p. 94 and validity p. 95.) Tinker (110), in an evaluation of eye-move- 
ment measures, notes purposes for which they may be used. He con- 
cludes, however, that eye-movement records are not indispensable in 
clinical diagnosis of reading status. An exaggerated stress is placed 
upon the study of eye movements in the clinical situation by Hamilton 
(41) and Sievers and Brown (90).

Orthoptic training. Peters (84) found that training with the Binocu- 
lar Synchronizer and the Squint Korrector yielded a reduction in fixa- 
tion and regression frequency. These changes, however, were similar 
to those produced by motivated reading practice alone. More extensive 
studies of this kind are reported by Parkins (75, 76, 77). Apparently 
there was marked improvement in reading with corresponding changes 
in eye movements due to orthoptic training. In this group of studies, 
the results are so presented that it is difficult to evaluate them. The 
reviewer is inclined to reserve judgment until check studies are made. 

Visual abnormalities. Extensive clinical use of eye-movement records 
has been made by Clark. An initial study (17), comparing the eye
movements of individuals having normal binocular balance with those
having muscular imbalance (exophoria), revealed no differences in
fixation frequency, regression frequency, pause duration, or time re- 
quired to complete divergence movements at the beginning of the lines 
of print. For similar comparisons in later reports (18, 19, 22), Clark also 
found no difference in fixation and regression frequency but there were 
differences in vergence movements during reading. The exophoric read- 
ers made significantly greater divergence movements at the beginning 
of lines and required more time to complete these corrective divergence 
movements. It is suggested that the large divergence movements of
subjects with muscular imbalance may cause enough fatigue to be im- 
portant in remedial reading programs. Clark (20) reports similar trends 
for a case of high exophoria. After noting that his experiments reveal 
no striking differences in oculomotor patterns between normal readers 
and those with muscular imbalance, Clark (24) concludes correctly that 
such deficiencies are apt to cause undue fatigue and should be corrected. 
His belief, however, that visual deficiencies cause reading disability in 
spite of the negative evidence is not well taken. In a correlational study, 
Stromberg (100) found no relation between amount of vergence move- 
ments and phoria of eye muscles. This, however, does not negate 
Clark’s contention that it is desirable to correct muscle imbalance for
the individual case. 

Remedial reading. In an extensive study of remedial reading at the 
college and adult levels, Buswell (15) employed the eye-movement 
camera to measure with precision the changes in perceptual habits 
which occurred during the experiment. He states, however, that an 
eye-movement camera should be considered a research rather than a 
clinical instrument. Simpler methods of measurement which are en- 
tirely satisfactory in an ordinary remedial program are available. Once 
a procedure in the reading clinic has been validated by eye-movement 
records, there is no need to recheck the procedure in other remedial 
groups. 


Training Eye Movements 

Problem. The early experimenters were concerned with identifying 
and describing oculomotor behavior during reading. As this field of in- 
vestigation developed, differentiating factors in eye-movement patterns 
were more emphasized. That is, stress was placed upon developmental 
variation, differences between good and poor readers, variation with 
changes in subject matter read, etc. This eventually led to the practice 
of attempting to improve reading performance by training eye move- 
ments. The necessary assumption is that eye movements are important 
determinants of reading proficiency. Thus, if the retarded reader is 
trained to use eye-movement patterns similar to those which character- 
ize efficient reading, his reading proficiency would presumably improve. 
This emphasis upon the mechanics of eye movements tended to direct 
attention to peripheral factors as determinants of reading performance 
rather than to the central or underlying processes of perception, apprehension
and assimilation.

Methods. Techniques of training eye movements in reading vary.
On the one hand are those, represented by Pressey (see Tinker, 108) and 
Sievers and Brown (90), who try to train subjects to pace eye fixations 
at three regular intervals along a line of print. It is true, however, that 
Sievers and Brown suggest that eye-movement training is but one small 
part of a reading program. Nevertheless it is naive to assume that if a
line of print is separated into three groups of words, paced eye
movements will result in reading each group with one fixation. In fact,
Dearborn and Anderson (27) have demonstrated that such an arrangement
results in more than three fixations. Buswell (15) also notes that
few subjects ever achieve three fixations per line. At the other extreme 
are those who attempt to increase the fixation span by employing tech- 
niques which approximate the normal reading situation. Thus, Dear- 
born and Anderson (27) have devised a flexible film projection method 
for teaching phrasing to increase the size of fixation span. Approximately
the same principle is involved in the still-film projector method
developed by Buswell (15) to increase span of recognition or fixation
span. Successive phrases of text are exposed in proper sequence and 
spacing to lead the eye along in normal perceptual sequences. Timing 
and nature of textual material are flexible in both techniques. Between
these extremes a variety of techniques have been employed to train 
eye movements. The most commonly used one is the metron-o-scope 
designed to develop “controlled reading.” It is described by Taylor
(105). This is a mechanically operated machine for exposing each line
of print in three successive sections. Taylor (105) states that the funda- 
mental idea of the metron-o-scope is to develop the mechanical, i.e. eye 
movement, and interpretative processes simultaneously. As noted 
above, such techniques do not reduce fixations to three per line. 

Controlled reading. A large number of reports describe training of 
eye movements or controlled reading. Since Anderson (1, 3), Sisson (95),
Kottmeyer (52), Nelson (74), Pilant (85) and Traxler (117) have evalu- 
ated or reviewed much of this material, detailed notice to the separate 
studies will not be given here. The experimental reports on controlled 
reading include studies by Broom (10, 11), Dearborn and Anderson (27),
Dodson (28), Garver and Matthews (32), Hallock (39), Lauer, Henry
and Fritz (57), Lee (59), Robinson (86), Simpson (93), Taylor (105), 
and Witzeman (125). Although some of these studies were done without 
adequate controls and lacked acceptable statistical treatment, they are 
rather uniform in showing considerable improvement in reading due to 
the training. Three studies in this area will be surveyed in more detail. 
Sisson (96), using matched groups of adult readers, found that reading
with intent to improve produced significant changes in eye movements 
comparable to the changes from eye-movement pacing. In a more 
elaborate study, Cason (16), employing matched groups of school children,
found no significant differences in improvement in reading for the
students trained in phrase reading or in eye movements by use of the 
metron-o-scope in comparison with those who did motivated library 
reading. That is, free library reading resulted in the same gains as pro- 
grams stressing the oculomotor mechanics of reading. All made signi- 
ficant gains. Buswell (15) employed a film projector for controlled prac- 
tice to increase the fixation span or span of recognition. This resulted in 
improved reading proficiency which was reflected in changed oculo- 
motor patterns. Emphasis was upon apprehension of broader percep- 
tual units so that the reader would be freed from the requirements 
of detailed visual perception. This permitted greater attention to 
meanings. 

Evaluation. Eye-movement training may be evaluated as follows:

1. Many so-called procedures for training eye movements or for controlled 
reading result in improved reading efficiency. 

2. The improved reading status is reflected in modified oculomotor patterns. 

3. The improvement obtained by eye-movement training, with or without 
elaborate apparatus, is no greater than that resulting from motivated reading 
alone. 

4. There is no adequate evidence that training eye movements as such improves
reading (107). Experiments concerned with pacing eye movements and
controlled reading usually involve other techniques and are never divorced
from increased motivation. Buswell (15) flatly states “training eye movements
does not increase reading ability.” 

5. The training of eye movements too often becomes a ritual and tends 
toward an overemphasis upon oculomotor mechanics to the sacrifice of adequate 
attention to the more important processes of perception, apprehension and 
assimilation. This training may result in a decrease in the flexibility and 
adaptability of reading habits which characterize good readers. According to 
Buswell (15), “The exploiting of machines and gadgets” to control reading “by 
persons who do not understand the psychology of reading seems at present 
to be adding greatly to this mechanistic folly.” And Traxler (117) has “dif- 
ficulty in seeing any justification for purely mechanical attempts to train pupils 
in better eye movement habits.” 

6. The reviewer agrees with Traxler (117) that controlled reading tech- 
niques, i.e., training eye movements, “need not be used in a purely mechanical 
way but that they might well be used to supplement, in a way that would have 
strong motivating force,” a program in which reading comprehension is em- 
phasized. The reviewer believes, however, that as long as gadgets are used by 
those with an inadequate understanding of the psychology of reading, we will 
continue to have the undesirable emphasis upon oculomotor mechanics. 

Central Versus Peripheral Processes

Problem. Training of eye movements is intimately related to the 
problem of whether eye movements are causes or symptoms of reading 
proficiency. The view that eye movements are causes of reading proficiency
is based upon the discovery that when there is ineffective reading,
there are many fixations and regressions and that good readers use
few fixations and relatively few regressions. This trend developed until 
the point was reached where it was claimed that the good reader is good 
because he uses few fixations and rarely makes a regression. Thus, to 
develop a good reader presumably it is only necessary to develop effec- 
tive oculomotor habits. This gradually led to a strong emphasis upon 
peripheral factors, especially among the large group of writers with an 
inadequate knowledge of the psychology of reading. A few writers, including
this reviewer (107, 108, 111), began to question this emphasis.
It was pointed out that the more important determinants of reading 
proficiency were the central processes of perception, apprehension and 
assimilation. These processes are reflected in oculomotor patterns be- 
cause eye-movement habits are very flexible and appear to adjust them- 
selves readily to any change in the perceptual processes involved in 
reading. It would appear that eye movements merely reflect efficient 
or poor reading performance rather than cause it. 

Recent evidence and conclusions. There is really no evidence to sup- 
port the view that eye movements determine reading proficiency. As 
shown in the above analysis, the results obtained from training eye 
movements do not support the view that peripheral factors are dominant,
although the training was based upon the assumption that they are. On the other hand,
there are numerous reports which indicate that the central processes are 
the important determinants in reading performance. Analyses in which
the writers conclude that eye movements reflect the central processes of 
perception, apprehension and assimilation include: Anderson (1, 3), 
Traxler (117), and Sisson (95, 97). A variety of recent experimental 
studies of eye movements have led to the conclusion that oculomotor 
patterns reflect poor and efficient reading, not cause it. These include: 
Anderson (2), Fairbanks (/l), Knehr (51), Rogers (88), Sisson (94),
Swanson and Tiffin (104), Walker (120), and Walker and Molish
(121). For earlier studies see Tinker (107, 108). Changes in eye-movement
patterns occur when difficulties of comprehension arise, with variation
in reading purpose or attitude, etc. It is now well established that
oculomotor reactions are exceedingly flexible and quickly reflect any 
variation in the central processes of perception, apprehension and 
assimilation. 





MILES A. TINKER


Evaluation. Careful consideration of the field would probably indi- 
cate that there need be no conflict in assigning proper roles to peripheral 
and central factors in reading. To read, one must perceive words in 
proper sequence and to do this the eye fixations must progress in appro- 
priate patterns. In other words, oculomotor patterns are a means of 
achieving the perceptual sequences necessary for apprehension and 
assimilation of the textual material. The particular eye-movement pat- 
tern employed is conditioned by the nature of the central processes. 
Clear apprehension and rapid assimilation of the textual material is 
reflected in relatively few fixations and regressions. But ineffective per- 
ception and confused apprehension, accompanied by difficulties in as- 
similation, characteristically produce many fixations and regressions. 
The latter, however, should not be confused with the oculomotor pattern
occurring in highly analytical reading where it is necessary to re-examine
the text and to work out relationships. In such a case, as
shown by Tinker, Frandsen and others (see Tinker, 107), the complex 
eye-movement pattern may indicate effective reading rather than in- 
efficient central processes. It seems possible, therefore, that the un- 
fortunate dichotomy of central versus peripheral processes might be 
avoided to some degree by omitting use of the terms eye-movement 
patterns and rhythmical eye movements, and by employing instead the 
concept of perceptual sequences. The latter is more meaningful in 
discussing the reading process and it avoids the unfortunate mechanical 
implications of the former. 

Summary Statement 

Important progress in the study of eye movements in reading has 
been made during the past ten years. Many of the investigations have 
been of high quality. Nevertheless unfortunate trends have appeared. 
Many unqualified individuals, those without an adequate background 
in either experimental procedures or the psychology of reading, have been
doing “research” in eye movements. The results are deplorable. Furthermore,
the exploitation of the eye-movement technique has accentuated
the unfortunate emphasis upon peripheral determinants of
reading proficiency and has led to wide use of eye-movement measurement
in the clinical situation although the eye-movement camera should
be considered a research rather than a clinical instrument. There are
signs, however, that the conclusions from sound research and the views 
derived from adequate analysis of results in the field are beginning to 
make headway. 


COLOR ADAPTATION TO 1945*

JOZEF COHEN 
Cornell University 

It is the purpose of this paper to present a critical history of the 
phenomenon known to workers in color as color adaptation, color 
fatigue, or color minuthesis. 

Although usually attributed to later investigations, the first scien- 
tific observation of color adaptation appears to be that described by 
the French mathematician and physicist De La Hire (12, p. 289) in 
1694. His was a description of observations of the sun, the course of 
perceptual change, and the resulting after image. He ventured an
hypothesis concerning the fatigue of visual receptors, and although it is 
somewhat erroneous in view of later investigations, it is character- 
istically the Hering theory. 

The first important investigation was the 1863 contribution of 
Maria Bokowa (5). Bokowa wore copper oxide glasses of 1 mm. thick- 
ness from four to five hours. The glasses had a transmission which 
was predominantly red, but they could not have been highly selective 
since she was able to see colors in all parts of the spectrum. The experi- 
ment was controlled, in a manner, since a spectrum was viewed from 
time to time during the course of adaptation and comparisons made 
from one observation to another. Further, there was an attempt at 
quantification by viewing a color wheel and describing the perceptual 
color which corresponded to a physical stimulus resulting from an 
arbitrary number of circular degrees of arbitrary “primaries.” In general,
Bokowa’s results may be summarized as follows:

1. After color adaptation had taken place, all colors appeared as either yel- 
low or blue. 

2. The brightest color was a mixture of yellow and blue. 

3. The cut off point of the visible spectrum shifted toward the short end 
of the spectrum. 

It is to be noted that nowhere does Bokowa indicate that the adapta- 
tion phenomena continued to a neutral gray although Helson and Judd 
(19, p. 383) refer to Bokowa as having reached complete adaptation. 
Bokowa (5, p. 163) refers to red as appearing less intensive during 
adaptation and this indicates that some change in intensity may have 
taken place. She may, however, have been referring to a loss in saturation
although the word saturation does not appear in the paper.
Bokowa, then, reports changes in hue, and makes references to changes
in intensity.

* From a doctoral dissertation The Color Adaptation of the Human Eye directed by
Professor H. P. Weld. For an additional paper based on the same dissertation, see Cohen
(11).

In 1865, the phenomenon was noted by Aubert (3) in his observations
of red and blue colored squares against achromatic backgrounds. The 
loss of saturation was referred to as (3, p. 131) “Adaptation der Netzhaut
für Farben.” He adds that adaptation for blue light is faster than
adaptation for red light. It is interesting to note that Sheppard (33, p. 
35) says, “Nowhere in his writings does Aubert connect this term 
[adaptation] with color-changes, for which he uses the term ‘fatigue’.”
The use of the term adaptation for both white and colored light ap- 
parently did originate with Aubert. 

Sigmund Exner’s experiment (14) in 1868 was the first which used a 
spectrum as adapting and reacting stimuli. His experimental procedure 
was to throw a spectrum on a screen. Slits were placed in the screen 
so that the light of the adapting color came through one slit while the 
light of the reacting color came through the other. Exner was, of course, 
studying the effect of the after image as well. By observing at a distance 
the beam of light which penetrated the first slit, he adapted a small 
area of the retina. The adaptation time was ten seconds. The eye was 
then brought very close to the second slit which had the effect of 
stimulating a much larger area of the retina. And in this larger area 
there appeared the after image of the light stimulus of the first slit. 
Inspection of Exner’s table shows the following results: 

1. When the stimulus color and the adapting color are identical, the adapted
portion of the retina is always lower in saturation than the non-adapted portion. 

2. The adapted portion of the retina perceived all colors as red, green, or 
blue. 

With respect to this second result, Exner attempted to locate the 
spectral position of these basic colors whose hue does not shift during
adaptation and toward which all other hues shift. For this purpose 
he devised an instrument utilizing a telescope and a spectrum. His re- 
sults are as follows: 

1. For red the position is from the long end of the visible spectrum to a point
between C (656 mμ) and D (589 mμ), perhaps somewhat closer to C.

2. For green the location is between E (527 mμ) and b (517 mμ).

3. For blue it is in the immediate environment of the line G (431 mμ).

One of the few early studies directly concerned with the changes in 
intensity during color adaptation was performed by Schon (31) in 1874. 
Schon was not interested in adaptation phenomena as such but rather 
in the physiological mechanism to which discrepancies among previous 
experimenters who had determined the differential sensitivity of the 
retina to the different portions of the spectrum could be attributed. 
Schon suggested that the differences between the many investigators 
might be due to the color adaptation of the retina. 

Schon used prisms to obtain spectral stimuli. These stimuli were 
presented to an observer through a tube which was divided in half 
forming a stimulus and comparison field. The intensity of each half 
was controlled by a variable slit, the width of the slit being a measure 
of the intensity. The experiment was begun by making the slits equal 
on both halves of the tube. The comparison field was shielded from the 
view of the monocular observer while the stimulus was presented for 
3, 5, 10, or 15 seconds. At the end of the interval the comparison field 
was presented to the subject who stated whether the two fields were equal in 
intensity. If they were not judged to be equal, the slit of the comparison 
field was changed and the determinations continued until they were 
judged to be equal. Three stimulus colors were used — red, green, and 
blue. Schon came to these conclusions: 

1. After five seconds exposure to spectral red, green or blue, one half of 
the brightness is lost. 

2. There is a further decrease in brightness which takes place gradually. 

3. This further decrease is greatest for blue and least for red. 

Many criticisms have been leveled at this experimental procedure 
and consequently the results. His two test fields were not strictly com- 
parable because of the construction of the apparatus. He ignored the 
differential sensitivity of the retina to various wave lengths in com- 
parisons of heterogeneous wave lengths. For other criticisms and an 
alternative procedure which avoids most of the defects, see Almack (2, 
p. 27 ff). 

Ssamujlow’s 1889 experiment is only reported in abstract form (34). 
So meager is the information available that it is difficult to determine 
the exact nature of his investigation. Ssamujlow considered the duration 
of the after image as a measure of the adaptation of the retina, a 
theory which may not be strictly true. Ssamujlow does not adequately 
explain what he means by adaptation, but presumably he refers to loss 
in saturation. Using spectral light sources, he arrived at the following 
results: 

1. The retina does not adapt by equal amounts to different lights. Adaptation 
is greatest for red, less for green, and still less for blue. 

2. The degree of the adaptation is dependent on the time and the strength 
of the stimulus. 

3. The peripheral retina adapts to a lesser degree than the fovea. 

There are two contributions to the literature by Carl Hess. The first 
(20), published in 1890, describes his experimental procedure. A tele- 
scope was arranged in conjunction with spectroscopes so that there ap- 
peared in the telescope a left and right field, either of which could 
be illuminated with monochromatic light. The field on the left was a 
square while the field on the right had the form of a circle. The telescope 
could be shifted so that either the right or left field came into view. 
The observer’s eye was placed at the telescope and fixated the right 
edge of the left square field. After a given interval the telescope 
was shifted to the right field with the cross hairs in the center of the 
circle. Thus only one half of the adapted retina is covered with the new 
fixation, so that changes which occurred by virtue of the adaptation 
process are apparent by comparison of the fields. Nine different homo- 
geneous lights and two non-spectral lights made by mixing spectral 
red and spectral violet were used as both adaptation and reaction 
stimuli. Changes in hue were noted for all combinations of lights taken 
two at a time; each light was used as an adaptation light and matched 
with all other lights as reaction stimuli. There were two adaptation 
times — 10 seconds and 35 seconds. 

The experiment was continued in 1893 (21) and here the apparatus 
was modified to give results in terms of wave length. One spectroscope 
was fitted with two adjustable slits; through one slit passed the adapt- 
ing light while the reacting light passed through the other. The other 
spectroscope was fitted with a single adjustable slit. The telescope was 
not kept in a fixed position. The left half of the telescopic field could be 
filled with either the reacting light or the stimulus light while the right 
field was filled with light from the adjustable single slit of the right 
spectroscope. The observer fixated a point between the two fields for 
50, 70, 75, or 180 seconds. The left field was illuminated with the 
stimulus light while the right field showed blackness. At the end of the 
adaptation time the stimulus light was shut off and the reacting light 
and the comparison light presented simultaneously. The comparison 
light was readjusted after successive trials until it matched the percep- 
tion of the left field. Hess reaches these conclusions: 

1. There is a tendency for colors to appear as either blue, green, or red to the 
color adapted eye. This is only a tendency and the data present an erratic appearance. 
Hess’ cursory attempt to locate the wave lengths of these colors using 
only three colors as stimuli (having theoretical implications in the Helmholtz 
theory) found them to be 490 mμ for blue, 565 mμ for green, and 650 mμ for 
red. These figures are quoted erroneously by Almack (2, p. 26) and Sheppard 
(34, p. ff). To be sure, the basic colors are not well identified, but the data 
do indicate a red basic. 

2. The shift in hue increases with time. There is no reference to saturation 
in this paper. 

Voeste's 1898 experiment (45) is one of the keystones in the history 
of color adaptation. His apparatus was the Helmholtz color mixer (18, 
p. 395 ff) as described by König and Dieterici (22). Essentially two fields 
were presented to the subject. The observer monocularly fixated a 
point between the left stimulus field and the right comparison field. The 
stimulus field was illuminated with a monochromatic source while the 
comparison field was unilluminated. After ten seconds, the observer 
matched the right field with what was seen on the left. It was matched 
for hue by variation of the prismatic surfaces and for intensity by varia- 
tion of the slit openings, while saturation was completely disregarded. 
The stimulus fields covered the visible spectrum at three or four different 
levels of intensity. Voeste's results may be summarized as 
follows: 

1. All colors tend to shift in hue toward three basic hues, 560 mμ, 494-498 
mμ, and 460-470 mμ. Subsequent experiments using these wave lengths as test 
patches show no shifts in hue. Troland suggests that these changes are due only 
to the Bezold-Brücke effect (42, p. 210 or 43, p. 180). 

2. The comparison field is from 18.9% to 83.3% of the intensity of the 
stimulus field. These figures are in terms of slit openings and, of course, fail to 
take account of the differential sensitivity of the retina. 

In 1899 and 1900 Burch wrote two papers on the production of 
artificial color blindness. The results of the experiments described in 
these papers are so unusual that extreme caution needs be exercised in 
interpretation. In the first paper (8, p. 2) Burch wrote, “[The method] 
consists of exposing the eye to bright sunlight in the focus of a burning 
glass behind a transparent screen of the proper colour, and keeping it 
there until all sensation of that particular colour is lost.” This must 
mean, it appears, that adaptation goes to a neutral gray. Burch continues 
(8, p. 3), “With a combination of ruby glass ... it takes from a 
few seconds to two or three minutes to produce complete red blindness.” 
It is not clear as to whether he means the perception was of red or of 
neutral gray. He does claim that for an eye stimulated as described 
above, the sensation of red is not present, and the eye is completely 
blind to that hue. If this is the case, adaptation not only reaches a 
neutral gray, but the eye remains in that state for some time since its 
perception of red would be similar to its perception of gray. This con- 
dition, Burch continues, does not in any way affect the perception of 
other colors having no red component. It has no effect on intensity at 
all. 

In his second paper (9) Burch attempted to show that the effect was 
not pathological since it could be produced by moonlight. He wrote 
(9, p. 217), “In producing red-blindness, as there are no colors to the 
left of red, I have generally used a part of the spectrum which excites 
only the red sensation, and which therefore must continue to appear 
visible at all. But I have not thought it desirable to push the fatigue of 
the retina far enough to destroy this sensation of light.” Of course, 
the spectrum could appear as gray and still be visible. 

Beck (4) observed the effects of intense illumination on colors and 
noticed the adaptation effect. He continued his experiments with spec- 
tral sources and reported that the colors were not visible as hues or 
were desaturated, depending on the degree of adaptation. 

Porter and Edridge-Green (29) adapted the eye with a band of color 
for twenty seconds. Immediately after the adaptation period the eye 
was shifted to view a spectrum larger than the band; the band filled 
only the center portion of the spectrum so that comparison could be 
made between the adapted and unadapted portions. Although the ex- 
perimental procedure is interesting, their results are erratic and their 
experiments cursory. They noted that the retina did not lose sensitivity 
for red, but sensitivity was lost for blue. In general their results follow 
no discernible form or law, and there is no direct statement pertaining to a 
law of color adaptation. 

In 1913, however, Edridge-Green (13) contributed an additional 
series of experiments of more significance, two of which we report: 

I. After the eyes have been exposed to sodium light for twenty 
minutes, a spectrum is examined. The yellow has completely disappeared 
from the spectrum; the green and red meet without any intermediate 
color. But the red, orange, and green have not lost any of the 
yellowness which they previously possessed. 

II. Blue-green spectacles were worn for ten minutes. At first all 
white objects appeared as a vivid blue-green, but at the end of ten 
minutes' time, a white object appears white. On examining the spectrum 
with these glasses before adaptation, there was no red to be seen, 
there were small amounts of orange, and the blue, green, yellow, and 
violet were visible. After adaptation, there were no marked changes in 
the orange or any other color, except the green which took on a paler 
and more yellow appearance. The region of the spectrum corresponding 
to the dominant region of the transmission of the glasses was most 
affected by adaptation, while the blue and yellow on each side appear 
bluer and yellower respectively. 

The conclusions follow: 

1. No color is seen by the color adapted eye unless the corresponding physical 
stimuli are present in the light reaching the retina. Color adaptation produces 
its effect by subtraction, and not by the addition of any new color sensation. 

2. Color adaptation increases with time. 

3. The dominant wave length corresponding to the adaptation stimulus 
color of spectral regions appears colorless. Colors immediately on either side of 
the dominant wave length are shifted higher and lower respectively. 

The experimental apparatus used by Brückner (7) was Helmholtz’ 
Spiegelapparat (18, p. 161). This was essentially a piece of plate glass 
placed perpendicular to the table. The observer is able to look through 
the glass to the surface of the table beyond it, or to see the reflection of 
the near surface of the table, or both, depending on how the glass is 
covered. Brückner placed an adapting stimulus composed of two parts 
— one half chromatic and the other half achromatic — at the far end of 
the table. All of the colors were from the Hering series. The chromatic 
colors covered the visible spectrum and are indicated in Angstrom units. 
The achromatic half was either white, gray, or black. Each hue appeared 
three times — once with each of the achromatic colors. The reacting 
stimulus, placed on the near table, was one of the seven hues used as 
one half of the adapting stimulus or one of six additional colors. The 
experiments were carried out binocularly using daylight. The individual 
fixated the adapting stimulus through the plate glass. After ten to fif- 
teen seconds the apparatus was adjusted so that he observed the re- 
acting stimulus as a reflection. The results are expressed in terms of 
whether or not the reacting stimulus is more affected by the previous 
adaptation to the chromatic or achromatic stimuli. Since Brückner did 
not report for the case when the reaction color was the same as the 
stimulus color, no conclusion may be drawn except that changes in 
saturation do occur. 

In 1920 Hubert Sheppard (33) made the first systematic investiga- 
tion of color adaptation. His paper presents four main experiments: 

I. Five observers who fixated in direct sunlight seven large sheets 
of Hering colored paper which covered the range of the visible spectrum, 
described the course of adaptation during fixation. In general, they report 
either no hue change or a hue change to yellow, with no change in 
intensity. The loss in saturation proceeded to a neutral gray in from 
45.9 seconds to 220 seconds. When the subjects observed 180° of the 
same colors mixed with 180° of black and white on a color wheel, com- 
plete adaptation took place in from 23.4 seconds to 121.3 seconds. 

II. In the second experiment, circles of the Hering papers placed in 
a field of Hering gray were illuminated by a 75 watt bulb. By means of 
a telegraph key attached to a kymograph the observers signaled when 
important perceptual changes occurred and also when uncontrolled eye 
movements occurred. They did not give a running account of the 
adaptation process. In general, adaptation was not characterized by 
shifts in hue as in the first experiment nor was complete adaptation 
reported by all subjects. All observers, however, reported loss in satura- 
tion and told of a cloud-like fog which moved over the color. The adap- 
tation times (times at which no further change occurred) ranged from 
40.3 seconds to 120.2 seconds. Sheppard determined the chromatic 
limens for each of the stimulus hues by the method of limits using a disc 
containing black, white, and the hue. The limen has a negative cor- 
relation with the adaptation time. Wherever the adaptation time is 
long, the chromatic limen is small. 

III. This series used a remodeled spectrophotometer so designed that 
the subject, observing through an artificial pupil, was able to perceive 
a small portion of the spectrum as a disc. Five positions were 
chosen from the visible spectrum. All but a single observer reported 
complete adaptation. The time ranged from 76.6 seconds to 169 seconds. 
Since Sheppard noticed that these times formed a curve which closely 
resembled the brightness curves for spectral colors, he performed an- 
other series in which the colors were equated for brightness value. The 
resulting figures, although more similar than those obtained before the 
brightness values were equated, still showed another factor to be oper- 
ating. This, Sheppard reasoned, was the chroma value already demon- 
strated as being related to the adaptation time in his second experiment. 

An additional series was conducted in which the chroma values were 
made equal (by subjective techniques). It does not appear that the 
brightness values were kept constant as well. His apparatus was too 
crude to control both factors at the same time. The adaptation times 
(for complete adaptation) for the various colors were now close enough 
so that Sheppard considered chroma as being the chief determiner of 
adaptation times. Introspective reports indicated no hue change except 
for red and violet which shifted toward yellow and blue respectively. 
There was no reported change in intensity. 

IV. The last experiment was an attempt to determine whether 
complete adaptation times could be reached with very high intensities. 
A carbon arc provided the light source while gelatin filters were placed 
over a pair of glasses. The experiment was monocular. The subject 
fixated the arc while wearing the glasses until no further change took 
place. He then fixated a spectroscope which was equipped with a power- 
ful illuminating source and a rheostat to control the intensity. (The 
rheostat no doubt changed the color temperature.) This served as a 
comparison light. The subjects reported complete adaptation in from 
105.3 seconds to 197 seconds. There was some shift in hue toward 
yellow but most subjects reported no shift in hue at all. As Almack 
points out (2, p. 20 ff), Sheppard’s description of this portion of his 
experiment is not clear. It is difficult to determine the nature of this 
comparison light formed by the spectroscope. 

Troland (41) has reported the results of what he terms photopic 
adaptation. These are the same experiments referred to in his previous 
papers (37, 39, 40). The eleven experiments will be described: 

I. In the first experiment, designed to test the equilibrium sensation 
for direct sunlight on white paper, a large white paper was fixated in 
direct sunlight for ten minutes. He concluded that prolonged exposure 
of the retina to large, very intense, achromatic stimuli does not involve 
a reduction of the sensation to neutral gray. The luminosity of the per- 
ception after adaptation is much greater than that of mid-gray. Fur- 
ther, the equilibrium sensation just described is brought about rapidly. 

II. In this experiment large sheets of colored paper, selected to cover 
the range of the visible spectrum, were observed in direct sunlight. From 
these experiments he drew the conclusions that after exposure of the 
retina to large chromatic stimuli of high intensity, the resulting percep- 
tion is not one in which the hue of the stimulus has completely dis- 
appeared. There may be some shift in hue but the perception is never 
one of neutral gray. The equilibrium point is reached soon after fixation. 

III. A dark gray square, placed in the center of a yellowish-green 
Hering paper, was fixated until the green component of the paper faded 
leaving a light blue. At the end of four minutes a shutter arrangement 
substituted a Hering violet paper for the gray square. The outlying 
field became a darker blue while there was no reappearance of the 
green. But on the introduction of a white object into the green adapted 
area, the green color returned. Additional tests indicate that this phe- 
nomenon is not due to the contraction of the pupil. The experiment, of 
course, lends support to the theory of central control of color adaptation. 

IV. Observations similar to those of Experiment I were made with 
stimuli consisting of two juxtaposed Hering papers of different hues. 
One paper filled the upper half of the field while the other filled the 
lower half. The six possible combinations of the four Urfarben were 
employed. Troland concluded that the principle of simultaneous color 
contrast applies to the equilibrium position. The introduction of a 
neutral object into the bi-chromatic field brings about the same change 
in quality for both halves of the field which was described in the previous 
experiment for a simpler stimulus. An experiment using a bipartite 
fixation of black and white failed to show any tendency of the field to 
approach neutral gray. 

V. This experiment concerns the perception of a small luminous 
spot on a dark background. If the illumination is low, the spot tends to 
disappear and remain below the visibility threshold or fluctuate between 
visibility and invisibility. As Marbe (25) demonstrated, the more 
intense the spot, the less is the tendency for it to disappear. This was 
confirmed by Troland, and with spots of colored light fixated monocularly 
at high intensities, observers reported that the stimulus colors 
changed as in the previous experiments, but they did not approach a 
neutral gray nor did the spot fluctuate. 

VI. The fluctuations of the last experiment are explained by sup- 
porters of the Hering theory as being due to eye movements permitting 
the retina to recover. See, for example, Ferree (15). A large sheet of 
Hering paper was fixated. The paper was large enough to be well out 
of the field of possible eye movements. With steady monocular fixa- 
tions, repeated fluctuations in the field could be observed. Temporary 
disappearance could be obtained. 

Fluctuations described here cannot be explained on the basis of 
adaptation followed by recovery due to eye movement. Deviations in 
fixation would move the image on the retina through not much more 
than one fiftieth of its diameter. Changes in accommodation could not 
be responsible for fluctuations since such changes would only be on the 
edge of the image. 

VII. This experiment was designed to investigate whether the 
changes in the iris can be responsible for the visual fluctuations which 
the previous experiment showed were not due to eye movements or 
accommodation. The stimulus surface described in the previous experi- 
ment was perforated and a telescope placed behind. The iris of the 
observer was illuminated so that changes could be observed through the 
telescope. The observer pressed a telegraph key when the colored spot 
was seen to disappear. The pupillary contractions and expansions have 
a perfect correspondence with the reported fluctuations. 

VIII. If changes of the iris are responsible for fluctuative disappear- 
ances, it is obvious that the elimination of these changes by the use of 
drugs such as pilocarpin or atropin or the use of an artificial pupil should 
cause the elimination of these fluctuations. Drugs have been used on 
similar problems by Pace (28) and McDougall (27). Both investigators 
report that fluctuations were present. For the effects of the artificial 
pupil, see Troland (38). The intensity of the colored spot stimuli was 
placed under the control of the subject. Starting with a very low inten- 
sity, he increased the brightness until no fluctuations occurred for three 
minutes. The stimulus was then regarded as being at the required 
threshold. The results indicate that if the action of the iris is eliminated, 
no fluctuations occur for long intervals. The fluctuations which do oc- 
cur, according to Troland, are due to mechanical difficulties of the 
artificial pupil, to the electrical condition of the visual system, binocular 
rivalry, etc. 

IX. Spots of various colors equated for brightness were fixated with 
the artificial pupil. Since the brightnesses were of moderate intensity, no 
fluctuations of the spot stimuli were observed. The adaptation did not 
proceed to a neutral gray. If the stimulus is a psychological primary, 
there is no essential shift in hue. Intermediate hues tend to lose the 
red and green components. 

X. The subject observed a colored spot for twenty seconds. The 
intensity of the spot was then decreased until the subject indicated 
that it had disappeared. Fixation was kept constant by means of the 
bright crescents appearing at the edges of the adapted area. Reappear- 
ance of the whole area always occurred. 

XI. In the greater number of visual studies it is necessary to stimu- 
late but a single eye while the other is blindfolded. This is an ideal con- 
dition for binocular rivalry. To demonstrate this phenomenon, one eye 
observed a colored spot. Immediately after fixation, the stimulus was 
transferred to the other eye where it appeared at first in its proper color. 
But soon it became a sparkling white, reminding one of binocular luster. 
It is therefore suggested that the after image appeared in the other eye 
and when mixed with its complement (the stimulus hue), it appeared 
white. 

Troland did not discover this phenomenon. It was first noted by 
Newton in a letter to Locke and is the subject of Titchener's thesis. See 
Titchener (35) or, for a brief discussion in English, (36, p. 49 ff). See also 
Franz (16, p. 44) for evidence that this effect does not occur. 

Troland and Langford reported in abstract form some experiments 
concerning chromatic minuthesis (44). Minuthesis is their term for 
adaptation. They contend that changes which occur are only changes 
in saturation. Any further changes in hue are due to the Bezold-Brücke 
effect. 

In 1923 Hamilton and Laurens (17) used a comparison technique to 
study adaptation. One eye was fatigued for thirty seconds. Then both 
eyes were placed at telescopes which presented a colored field to each 
eye. The fields were adjusted until a just noticeable hue discrimination 
was indicated. The results were as follows: 

1. Red, green, and blue have a selective effect when used as stimulating 
sources. All other colors do not. 

2. During adaptation brightness is reduced. 

One of the more important studies of chromatic adaptation was per- 
formed by Almack (2) in 1928. Her problem was to measure the loss 
of sensitivity of the color adapted eye. Spectrum lights, used as stimuli, 
were controlled with respect to hue, saturation, and intensity. The 
threshold of the spectral light which was used to induce loss of sensi- 
tivity was determined at the end of each adaptation period. This was 
accomplished by the introduction of a motor rotated sectored disc into 
the path of the stimulus light. The rotating disc could be removed at 
will. Determinations were made after varying amounts of time (from 
2 to 300 seconds) and gathered together in the form of curves. Comparison 
could be made between different intensities of the same hue, 
but not between different hues since the thresholds for the different 
wave lengths have different values. Comparison is made, however, 
when the determinations are reduced to a percentage basis. The experi- 
ments were made with both a dark adapted and light adapted eye. 
Almack drew the following conclusions from her work: 

1. For stimulus lights made equal in intensity, the sensitivity of the non-adapted 
eye is, from greatest to least: yellow, green, blue, red. Initial loss in 
sensitivity is greater than at any other period. Initial sensitivity is greater for 
the dark-adapted eye than for the light-adapted eye, but both sensitivities tend 
to become equal with time. The rate of adaptation increases with the intensity 
of the stimulus. 

2. For lights equalized for perceived saturation of the stimulus, the rates of 
adaptation to the various stimuli are very unequal. The largest rate (2, p. 115) 
“of adaptation by no means occurs at that intensity of light which gives maxi- 
mum saturation to the color sensation. . . . The rate of adaptation is a function 
of the physical intensity of light, not of its subjective aspects, saturation and 
brightness.” 

3. For lights made equal with respect to wave length, the loss of sensitivity 
varies with the wave length. Under dark-adaptation, chromatic sensitivity is 
greater than for light-adaptation. 

There is some confusion as to just what is measured by the tech- 
nique described in this experiment. Consider a retina adapting to a 
stimulus light of a particular hue, saturation, and intensity. After a 
time, we determine the intensity limen of the eye to the same hue and 
saturation. This is true since we use the same source and introduce a 
rotating sectored disc. If we consider the color solid to be a double cone, 
then in dealing with hues of maximal saturation which lie on the surface, 
the saturation of such hues is dependent on intensity. They are de- 
pendent variables. Then if the intensity of our stimulus light is reduced, 
we have also reduced the saturation. On the other hand, if the color 
solid is considered to be a cylinder (and this is implicit in the ‘fundamental 
sensation curves’ of König and Dieterici (22)) there is no relation 
between saturation and intensity. Under these conditions, measurements 
by the above sectored disc can only indicate changes in perceived 
intensity. 

Roaf (30), too, used the threshold of the eye as a measure of sensi- 
tivity. His apparatus was very complicated, but essentially it permitted 
the threshold to be determined after stimulation. He concluded that 
when any light stimulates the retina, the sensitivity, even of another 
part, is decreased to the short wave lengths and somewhat less to the 
long. 

Kravkov (23) utilized the Helmholtz color mixer (18, p. 395 ff) in 
1928 to test a theory of Lasareff concerning adaptation. The left field 
was fixated for ninety seconds and then matched with the right. The 
fixation was monocular. The match was made for intensity only. The 
results, claims Kravkov, support the outlined theory and indicate 
curves described by an equation with variable parameters. Kravkov's 
curves are, however, misleading. For example the first curve on page 93 
(23) would fit the observation points just as well if it were completely 
inverted, or, indeed, if just a horizontal line were drawn. The conclu- 
sions to which Kravkov came are the following: 

1. The decrease of intensity of a color sensation following color adaptation 
is of great importance and accords with a lessening of the stimulus to 50% of its 
initial value. 

2. The course of adaptation depends not only on the intensity of the 
stimulus but also on the hue. At the same intensity, violet (451 mμ) is more 
quickly adapted than red (656 mμ) and less than green (550 mμ). 

Helson and Judd (19) performed two important experiments in 
1932. The first concerned itself with eye movements and their relation 
to color adaptation. An apparatus was constructed so that an individual 
placed his head in a sphere 36 inches in diameter. The inside of the 
sphere was lined with orange-red Hering paper. The subject fixated the 
inside of the sphere binocularly. Strong illumination eliminated pupil- 
lary changes. In theory the apparatus was so constructed that any eye 
movements did not change the effective stimulus. Observations were 
made for from five to seventy-five minutes. The conclusions arrived at 
by Helson and Judd by use of their ‘adaptation sphere’ are as follows: 

1. The neutral condition of gray did not appear permanently. Gray was 
perceived but its perception was cyclic and spasmodic. They think that this 
was probably due to the pupillary effect. 

2. The red component vanished making the perception more yellow. 

3. Most of the adaptation took place within the first few minutes of stimula- 
tion. 

In the second experiment, the subjects wore strongly selective red or 
green glasses (the transmissions are given in detail). The observer wore 
these glasses for as long as five hours and at hourly intervals identified 
colored Hering papers for hue, saturation, and intensity. It is difficult 
to understand how a subject wearing glasses transmitting nothing less 
than 600 mμ could see the blue component in a blue paper. Note this 
report (19, Table 2, p. 388). The results are as follows: 

1. After five hours, the lighter colors appeared in the hue of the glasses 
while the darker colors were either neutral gray or complementary in hue to that 
of the glasses (because of the after image). 

2. At first there was a sharp loss in saturation as well as a shift in hue — a 
shift toward yellow. 

Attention should be called to the theoretical explanations in this 
paper. Helson and Judd assume that if complete adaptation did occur 
while the individual is wearing colored glasses, he would see everything 
in 'natural' colors. This, they point out, is impossible if the glasses are 
highly selective since adaptation cannot supply what is physically not 
present. Further, they continue, if the glasses are not highly selective, 
the objects may appear in 'natural' colors since the predominant wave 
length will cause quick adaptation to itself and since the glasses transmit 
at least some of all the wave lengths, objects will be seen in 'natural' 
colors. This last is, of course, quite possible. Such non-selective glasses 
might lead to such a result. But adaptation with highly selective glasses 
can occur, at least theoretically. In this case everything transmitted 
would eventually be perceived as a neutral gray, just as a gray 
physical stimulus would be perceived by a non-adapted eye. 

In 1934 Bouma (6) commented that when looking at a white surface 
with a black card at the center with sodium (or neon) light, the black 
part, after a time, took on a violet (or green) hue. His study, however, 
was more concerned with after images. 

The first approach to the color adaptation problem by use of twin 
colorimeters was by Wright (47) in 1934. The apparatus was that de- 
scribed by him in (46). It can be characterized by the following. There 
were two eye pieces each of which was connected to a Wright color- 
imeter. Any hue could be seen in either of these eye pieces by adjusting 
three knobs associated with each colorimeter. The knobs mixed three 
spectral primaries. All hues in high intensities but not all saturations 
could be produced. The third eye piece, to the right of the other two, 
presented a field of white light. The subject was dark adapted for a 
period of thirty minutes. He then fixated the fields of the two color- 
imeters; the left field was seen by the left eye while the right field was 
seen by the right eye. Binocular rivalry did not occur, according to 
Wright. The test patch of the right eye was matched with the com- 
parison field of the left. The calibration readings were noted by the 
experimenter. The subject then shifted his right eye to the eye piece 
showing white light and fixated this patch for three minutes. The left 
eye, during this time, was unilluminated. At the end of three minutes, 
the observer again fixated the colorimeter fields as before. The fields 
no longer matched. The observer quickly matched the left patch with the 
perception of the right. When the match was completed, he signaled the 
experimenter, who noted the time and the wedge readings, and the subject 
proceeded to make another match. The process was continued until 
no further change was noted. 

This attack of the problem is somewhat different from anything else 
we have encountered. Here, it is the recovery time after adaptation that 
is measured. And recovery is given in terms of the spectral primaries of 
the colorimeter. At the beginning the two fields of the colorimeter are 
the same. The right eye is stimulated with white light which, it is as- 
sumed, depresses (or adapts) the activity of the receptor for the test 
patch to a level different from the original perception. The eye is then 
allowed to recover to the level of the test patch. This recovery is sup- 
posedly measured with the left eye. What is not taken into account, 
however, is that the eye does not recover to the level set by the test 
patch but continues to be stimulated by it until the receptors reach the 
equilibrium position. It appears that the white light is a confusing ele- 
ment and we wonder that Wright did not adopt the experimental pro- 
cedure used in Cohen's (11) experiment. For additional papers using 
retinal recovery as a measure of adaptation see Allen (1) and the bibliography 
included in that paper. 

Since the results of Wright’s experiment are given in specialized re- 
sponse curves, no interpretation can be given until they are mathe- 
matically transformed to a more widely used system of primaries. 

Color fatigue of the peripheral retina was investigated by Cogan 
and Cogan (10) in 1938-39. A perimeter was employed to hold the 
colors. The test color (Hering paper) was framed in a background hav- 
ing the same saturation as the test color. The time between the first 
perception of the color and its eventual merging into the background was 
considered to be a measure of adaptation. The results indicated the 
following: 

1. Color adaptation continues to a neutral gray. The rate of adaptation in- 
creases toward the peripheral field. 

2. Green was the most easily adapted color while red was the most difficult. 

3. Dark adaptation influences color adaptation times. 

There is a cursory study by Schouten and Ornstein (32) published 
in 1939. Their experiments utilize a separate field for each eye. They 
report that it was difficult to keep the two fields from fusing or over- 
lapping. The comparison could be made in brightness only. Their con- 
tribution is that the adaptation of a test patch (here loss in intensity) 
is dependent upon the state of the rest of the retina. 

Cohen (11) has recently completed an investigation in which he 
utilized an instrument which presented two test patches to a binocular 
subject. The apparatus was so constructed that any hue could be placed 
in each patch by the adjustment of three controls. A test patch was 
placed in the right side of the instrument, a neutral gray on the left. 
After one minute the observer matched the left with what he saw on 
the right. The match being made, the left side is returned to gray while 
the right continues to adapt. After another minute, another match is 
obtained. In this way successive matches are made against time for a 
particular test patch. The instrument readings which are given in terms 
of monochromatic radiations of red, green, and blue (700 mμ, 546 mμ, 
436 mμ) are transformed to the monochromatic system of hue, saturation, 
and intensity. Thus quantitative changes in these latter variables 
were plotted against time. He did not employ intermediate colors. 

Under the conditions of Cohen’s experiment the following three laws 
may be stated: 

Law I. The course of color adaptation consists of a gradual but never 
complete loss of saturation, with no change in hue, and an increase in 
intensity. 

Law II. The rate and degree of desaturation depends upon the hue, the 
relative intensity, and the relative saturation of the stimulus color. 

a. The loss of saturation for a green stimulus is most rapid, it is 
less for red, and least for a blue stimulus. 

b. The less intense the stimulus, the greater the loss of saturation. 

c. The less saturated the stimulus, the more rapid is the initial rate 
of desaturation; the equilibrium position, or point of lowest desatura- 
tion, is independent of the saturation of the stimulus. 

Law III. The rise in intensity is dependent on the saturation and in- 
tensity of the stimulus and independent of the hue of the stimulus. 

a. If the saturation and intensity of the stimulus are both high or 
both low, the rise in intensity will be greater than if one is high and the 
other is low. 


Summary of Methods 

The methods used by the various investigators may be summarized 
as follows: 

I. Introspective — mere description of adaptation. 

De La Hire (12), Aubert (3), Ssamajlow (34), Sheppard (33), Troland 
(41), Helson and Judd (19), Bouma (6) 

II. Controlled introspective — perceived objects are carefully controlled 
(no comparison with unadapted portion of retina). 

Bokowa (5), Burch (8, 9), Beck (4), Edridge-Green (13), Sheppard 
(33), Helson and Judd (19) 

III. Liminal — limen (intensity) is measured after adaptation. 

Almack (2), Roaf (30) 

IV. Comparative monocular introspective — two fields viewed monocu- 
larly — differences are noted introspectively. 

Exner (14), Porter and Edridge-Green (29) 

V. Comparative monocular non-introspective — two fields viewed mo- 
nocularly — match made on instrument. 

Exner (14), Schön (31), Hess (20, 21), Voeste (45), Edridge-Green 
(13), Kravkov (23) 

VI. Comparative binocular introspective — two fields viewed binocularly — 
differences noted introspectively. 

Bruckner (7) 

VII. Comparative binocular non-introspective — two fields viewed bi- 
nocularly, one field for each eye — match made on instrument. 

Wright (47), Hamilton and Laurens (17), Schouten and Ornstein 
(32) 

VIII. Comparative duo-binocular non-introspective — two fields viewed 
binocularly, both fields seen by both eyes — match made on instrument. 

Cohen (11) 


Summary of Results 

The results obtained by the various investigators may be summar- 
ized as follows: 

I. Saturation 

Partial loss: Bokowa? (5), Aubert (3), Exner (14), Troland (41), 
Helson and Judd (19), Almack? (2), Roaf? (30), Cohen (11) 
Complete: Edridge-Green (13), Sheppard (33), Cogan and Cogan 
(10) 

II. Intensity 

No change: Burch (8, 9), Sheppard (33) 

Loss: Schön (31), Voeste (45), Almack (2), Hamilton and Laurens 
(17), Roaf (30), Kravkov (23), Schouten and Ornstein (32) 
Increase: Cohen (11) 

III. Hue 

Investigator            Shift; Basic Colors* (R, Y, G, B, V) 

Bokowa (5)              X; X 
Exner (14)              656-589; 527-517; 431 
Hess (20, 21)           650; 565; 490 
Voeste (45)             560; 498-494; 470-460 
Sheppard (33)           X; X 
Troland (41)            X; X; X; X 
Helson and Judd (19)    • 
Cohen (11)              no shift; X 

THE EFFECTS OF NOISE* 

F. K. BERRIEN 
Colgate University 

The development of modern industrial society with its concentration 
of population and machinery in confined areas has brought with it a 
concomitant concentration of noise. The popular science literature of 
the past twenty years is dotted with articles describing the deleterious 
effects of noise in highly emotional terms unwarranted by the facts. 
Few problems have attracted a wider range of professional interests 
than the problem of noise and what to do about it. Physicians, public 
health authorities, architects, psychologists, otologists, physicists, sound 
and electrical engineers have contributed to the literature. On the in- 
dustrial side interest and concern is evidenced by the telephone and 
radio industries, automotive, airplane and railroad equipment designers, 
manufacturers of office equipment, building materials and a wide range 
of consumer appliances. Most recently the safety engineers and in- 
surance companies have expressed an interest because of the growing 
recognition of occupational deafness. One is reminded also of the vari- 
ous noise abatement campaigns which sprang into prominence in the 
early 1930's. 

In the light of this widespread interest, one might reasonably expect 
to find a considerable body of well grounded scientific literature relevant 
to the topic of noise and how it affects the behavior of man. On the 
contrary there is a paucity of studies dealing with the problem at the 
human level in terms which provide a basis for generalizations even 
though a wealth of unsystematized experience by acoustical engineers 
points to the conclusion that people are benefited by noise reduction. 
Part of the reason for this state of affairs lies in the fact that only within 
the past 20 years has there been any attempt to standardize terminology 
and techniques in the measurement of noise itself. Even today the 
available procedures are still in the developmental stage where many 
improvements are to be desired. 

It is the purpose of this review to summarize critically those studies 
pertaining to the effects of noise on human beings. Occasionally it may 
be necessary to refer to studies of animals to point up certain areas but 
in general the emphasis will be upon studies using human subjects. 

* This review has been prepared as a preliminary step for a research program dealing 
with the effects of noise reduction on manufacturing personnel, sponsored by the Acoustical 
Materials Association. The author is especially indebted to Hale J. Sabin and 
Clarence W. Young for assistance in the preparation of this paper. 

Consideration will be given in turn to the effects of noise on output speed 
and accuracy, fatigue and energy expenditure, the course of noise- 
adjustment, stimulation deafness, and finally the factors determining 
relative annoyance. Preliminary to an examination of these problems 
it is necessary to summarize the methods of measuring noise and their 
limitations. 

Definition of, and Procedures in Measuring Noise 

Noise has been defined as unwanted sound (60). This statement 
bearing as it does a heavy value judgment appears to be widely accepted 
among physicists, engineers and the like. Paradoxically the psycholo- 
gists lean toward a definition couched in more physical terms. Thus 
Dockeray (11) writes, “If a complex of vibrations is not in a harmonic 
ratio, we speak of the result as noise.” And Dashiell (7) agrees in saying, 
“When air vibrations are non-periodic and irregular, or are less than 
two full vibrations, they produce noises rather than tones.” The present 
paper is primarily concerned with noise as unwanted sound recognizing 
at the same time that the degree of annoyance may vary from a theo- 
retical zero to some maximum point depending upon factors to be 
enumerated later. 

The unit of noise measurement most widely used is the decibel. This 
is a logarithmic unit based on sound energy measurements. Two sounds 
differ by N decibels when $N = 10 \log_{10}(E_1/E_2)$. ($E_1$ = sound energy density 
of sound 1; $E_2$ = sound energy density of sound 2.) In practice it has 
been found that 1 decibel represents approximately 1 J.N.D. in a 1000 
cycle tone. The zero point on the decibel scale could be set at any value 
but has now been standardized at $10^{-16}$ watts per square centimeter 
which is a convenient value somewhat lower than the normal threshold 
of hearing for a 1000 cycle tone. Publications prior to the tentative 
adoption of standard acoustical terminology in 1936 were often confusing 
in their use of the decibel scale because the minimal auditory 
threshold varies with frequency of vibration. Early writers refer to a 
tone of 200 cycles at 30 db without specifying whether they mean 30 
db above the 1000 cycle tone threshold or 30 db above the 200 cycle 
threshold. 
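
The definition just given can be illustrated with a few lines of Python. This is only a sketch: the function names are assumptions introduced here, and the only quantities taken from the text are the formula N = 10 log10(E1/E2) and the zero point of 10^-16 watts per square centimeter.

```python
import math

# Zero point of the decibel scale named in the text (watts per square centimeter).
REFERENCE_INTENSITY = 1e-16


def decibel_difference(e1, e2):
    """Return N = 10 * log10(e1 / e2) for two sound energy densities."""
    if e1 <= 0 or e2 <= 0:
        raise ValueError("energy densities must be positive")
    return 10.0 * math.log10(e1 / e2)


def intensity_level(e, reference=REFERENCE_INTENSITY):
    """Level of one sound in db. above the standardized zero point."""
    return decibel_difference(e, reference)


# A hundredfold ratio of energy densities, in decibels.
print(decibel_difference(100.0, 1.0))
# A sound of 1e-12 watts per square centimeter, in db. above the zero point.
print(intensity_level(1e-12))
```

On this scale a hundredfold ratio between two energy densities corresponds to 20 db., and an energy density of 10^-12 watts per square centimeter lies about 40 db. above the zero point. The sketch says nothing about the threshold at other frequencies, which is the source of the ambiguity in the earlier literature.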

Less frequently used in this country is the word phon which is 
numerically equivalent to a decibel when considering the standard 1000 
cycle tone. For any tone other than 1000 cycles its phon value is 
determined by equating in loudness that tone with the standard. Re- 
ferring to Figure 1 this means that a 300 cycle tone having an intensity 
level of 40 db. above $10^{-16}$ watts per square centimeter will appear to 
be equal in loudness to a 1000 cycle tone having a 30 db. intensity level 
above the same base. Hence the 300 cycle tone at that intensity has a 
loudness level of 30 phons. 

Related to this unit are the terms loudness units and sones. If a 1000 
cycle tone at 40 db. be taken as a standard and a second tone be reduced 
to one-half the subjective loudness of the standard, the second is said to 



Figure 1. Equal Loudness Contours (30) 


be one-half a sone. Conversely a tone adjusted so that the normal ob- 
server judges it to be twice as loud as the standard is said to be 2 sones. 

Stevens proposed this latter unit and has published curves (51) 
showing the relation of sones to db. level for a number of representative 
frequencies. The sone curve for tones between 700 and 4000 cycles is 
almost identical with a curve of loudness units (L.U.) published by 
Fletcher (16) and subsequently adopted with slight modification by the 
American Standards Association (60). The L.U. curve was constructed 
on the basis of several independent studies conducted in this country 
and in England where estimates of various fractional and multiple 
loudnesses were plotted against phons. The chief difference between the 
L.U. scale and the sone scale is in the magnitude of the numerical 
values assigned, the L.U.'s being roughly 1000 times greater than sones. 
Either scale is more indicative of the subjective effects of noise level 
changes than the db., phon or the direct intensity scales. Although the 
American Standards Association has specifically defined decibels, phons, 
and loudness units, it is common practice at least in speaking and often 
in writing to use the term decibel to cover physical sound intensity, 
equal loudness contours, or subjective loudness. It would be more 
meaningful to persons unfamiliar with acoustical terminology if phons 
were translated into either sones or loudness units. 

One practical way by which proper usage of these terms could be 
encouraged is by redrawing the equal loudness contours of Figure 1 (a 
figure that appears in almost every acoustical discussion) in terms of 
either sones or L.U.'s. As the figure stands the spaces between the 
contours do not represent equal increments in loudness. Stevens and 
Davis (51) have already made a step in this direction by assigning 
equivalent sone values to the contours as already drawn. The transla- 
tion into L.U.’s can be made by using Table I. 

TABLE I 

Equivalent Phon and L.U. Values 

Phon      L.U.        Phon      L.U. 

110       215000      50        2200 
100       88000       40        980 
90        38000       30        360 
80        17100       20        100 
70        7950        10        10 
60        4350 


Figure 2 is an attempt to make the equal loudness contours more meaningful 
by translating phons into L.U.’s (× 10) using the table published 
by the American Standards Association (60) and then plotting the con- 
tours so that each successive contour going up the scale represents a 
doubling of the loudness. 
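
A reader wishing to translate phon values that fall between the entries of Table I may find the following Python sketch useful. It copies the entries of Table I as printed and, as an assumption not made in the text, interpolates between tabulated points on a logarithmic scale of L.U.; the function name is hypothetical.

```python
import bisect
import math

# Entries copied from Table I: loudness level (phons) -> loudness units (L.U.).
TABLE_I = [
    (10, 10), (20, 100), (30, 360), (40, 980), (50, 2200), (60, 4350),
    (70, 7950), (80, 17100), (90, 38000), (100, 88000), (110, 215000),
]
PHONS = [p for p, _ in TABLE_I]
LOUDNESS_UNITS = [lu for _, lu in TABLE_I]


def phons_to_loudness_units(phons):
    """Translate a loudness level in phons into loudness units."""
    if not PHONS[0] <= phons <= PHONS[-1]:
        raise ValueError("Table I covers 10 to 110 phons only")
    i = bisect.bisect_left(PHONS, phons)
    if PHONS[i] == phons:
        # Tabulated values are returned unchanged.
        return float(LOUDNESS_UNITS[i])
    # Assumption: interpolate linearly in log10(L.U.) between neighbors.
    lower_p, upper_p = PHONS[i - 1], PHONS[i]
    frac = (phons - lower_p) / (upper_p - lower_p)
    log_lu = ((1 - frac) * math.log10(LOUDNESS_UNITS[i - 1])
              + frac * math.log10(LOUDNESS_UNITS[i]))
    return 10 ** log_lu


# Ratio of loudness units for a 10-phon step from 70 to 80 phons.
print(phons_to_loudness_units(80) / phons_to_loudness_units(70))
```

In the upper part of the table a step of 10 phons roughly doubles the number of loudness units, which is why equal phon spacing between contours does not represent equal increments in loudness.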

Two basic methods have been employed in measuring the loudness 
of sounds. The first method or older of the two is the binaural matching 
method with or without an off-set receiver. In this procedure a normal 
observer is provided with a receiver being driven through a calibrated 
attenuator by some kind of oscillating circuit. While listening to the 
sound to be measured with one ear he matches the loudness in the other 
ear by adjusting the attenuator. The attenuator has usually been calibrated 
in terms of db. above the threshold for a 1000 cycle tone. A 
variation of this method employed in some of the earlier work was the 
use of a phonograph record of a warbling tone. The gain control was 
calibrated in decibels above the hearing threshold. 

Some question has been raised regarding the advisability of using a 
receiver, offset from the ear a centimeter or more, for the purpose of 
mixing the known and unknown sounds. Those favoring the use of an 
offset argue that comparisons are made more accurate because both the 



Figure 2*. Equal Loudness Contours Expressed in Loudness Units (× 10) 


known and unknown sounds are matched in a single ear. On the other 
hand others (5) insist the offset has a baffle effect on the incoming waves 
from the unknown source distorting their loudness. Practiced observers 
following a rigid variant of the method of limits familiar to psycho- 
physicists have in this manner measured complex sounds with a fair 
degree of reliability. “The spread of the mean final settings normally 
does not exceed ±2 db. at 100 db. level, ±4 db. at 80 db., and ±6 db. 
at 50 db.” (5). Some experience using a receiver without an offset while 
matching the loudness of an unknown noise indicates that the position 
of the receiver is highly critical. Slight pressure on it or a small shift 
with respect to the meatus produces large changes in the apparent loudness 
of the calibrated sound. 

* In preparing Figure 2, loudness units were multiplied by 10. To reduce, neglect 
the final zero, i.e. 2,550,000 = 255,000 L.U.'s and so on. 

Using a different arrangement Geiger and Abbott (20) presented 
twenty observers with sounds from a variety of sources (buzzer, rattler, 
vacuum cleaner, bell, etc.) by means of a microphone, amplifier, and 
loud speaker which were to be matched in loudness by a 1000 cycle tone 
introduced also through a loud speaker. The mean difference for all 
sounds and all observers between the sound level as measured by a meter 
and the matched tone was 2.4 db., the meter reading being between 76 
and 40 db. Generally the observers matched the unknown sound level 
with a 1000 cycle tone at a higher energy value than the sound level 
meter indicated. It is significant that Geiger and Abbott concluded, 
“This was a very severe test indeed, and the meter measurements were 
completely confirmed.” 

The second method employed in measuring sound levels is the objec- 
tive method in which a microphone is placed in the sound field and the 
electrical output of the microphone is amplified. An adjustable weight- 
ing network may be interposed between this amplified current and the 
indicating meter usually calibrated in decibels. The meter actually indi- 
cates A.C. power which is proportional to the rate of reception of sound 
energy at the microphone. The weighting network is adjustable so that 
the response of the instrument will approximate the response of the 
ear to the sound energies falling on it. 

Referring to Figure 1, it can be seen that between 30 and 1500 cycles 
at 100 db. energy intensity, loudness remains practically constant. 
However, if the energy intensity level is reduced to 40 db. the fre- 
quencies below 90 cycles would not be heard and those above 90 would 
vary from just perceptible to 40 phons in loudness. Consequently a 
noise meter must be adjustable so that at high intensities it gives a 
reasonably constant output for all frequencies and with low intensities 
it should give little or no response at low frequencies, but should increase 
its output as the frequency is raised to approximately 2000 cycles and 
then decrease as the frequency exceeds this level. Since it is not at 
present practically feasible to build a portable meter with networks 
permitting response curves similar to each of the loudness contours, 
most commercial sound-level meters are provided with three such networks 
adaptable to levels approximating 40, 70, and 100 db. in over-all 
intensity. 

Under special conditions where the noise to be measured is made up 
of a number of harmonic frequencies, as is the case with the noise from 
large transformers, the objective method gives readings that may be as 
much as 30 db. lower than the aural comparison method (5). The 
reason for this lies in two related facts. First, frequencies close to, but 
not exactly equal to the harmonics of a given fundamental exert a 
masking effect on that fundamental thereby reducing the loudness which 
would otherwise be present. Second, when a number of harmonic com- 
ponents are present in the noise being measured their energies add to, 
rather than subtract from the loudness in a manner not provided for in 
the sound-level meters now available (56). The meters commercially 
available compensate for the masking effect of the near-harmonics but 
do not sum the sound energies of the harmonics as the ear does. Geiger 
and Abbott noted in the report referred to above (20) that in cases where 
the sound was composed of a large number of “component musical 
notes, well separated in frequency and of nearly the same loudness” the 
individual ratings of loudness were considerably higher than the meter 
readings. 

Another limitation of sound level meters traceable to the micro- 
phone design is the wide tolerance permitted in their response char- 
acteristics by the American Standards Association. These are such that 
a 90 db. reading may have an error of plus or minus 3 db. On the surface 
such an error may appear small yet if one should take a low reading 
instrument to measure the sound level in a given area of approximately 
90 db., then return later after acoustical treatment with a high reading 
instrument, no change would be registered. Subjectively, however, un- 
practiced observers might judge the noise after acoustical treatment to 
be half as loud as before. Indeed, a change of 6 db. at that level is 
almost equal to doubling or halving the loudness as determined by a 
number of independent investigators using trained observers under 
careful laboratory conditions (19, 24, 26). In many field measurements 
of noise these errors are small relative to the fluctuation in sound energy 
from moment to moment and place to place. 
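
The point about instrument tolerance can be checked with simple arithmetic. The Python sketch below assumes a true reduction of 6 db. and the extreme errors of minus and plus 3 db. mentioned above; the function name and the particular figures are illustrative and are not measurements.

```python
# Tolerance at a 90 db. reading, as stated in the text.
TOLERANCE_DB = 3.0


def registered_change(true_before_db, true_after_db,
                      error_before_db, error_after_db):
    """Change in meter reading when each visit uses a meter with its own error."""
    reading_before = true_before_db + error_before_db
    reading_after = true_after_db + error_after_db
    return reading_after - reading_before


# Low-reading meter before treatment, high-reading meter afterward,
# with an assumed true reduction from 90 db. to 84 db.
change = registered_change(90.0, 84.0, -TOLERANCE_DB, +TOLERANCE_DB)
print(change)  # 0.0
```

With both meters at opposite ends of the permitted tolerance, the two readings coincide at 87 db. and the 6 db. reduction goes unregistered, although it is large enough to be judged as nearly a halving of loudness.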

It becomes clear that the aural comparison method is superior to 
the objective method in some instances provided that competent observers 
are employed, and the test tone is properly calibrated. The 
known variations in individual observers are of the same order as the 
allowable tolerances in microphone response characteristics. In addi- 
tion, however, the sound-level meters even if accurate in their response 
to the sound energy of the field do not register the non-masking effects 
of harmonic components in a sound spectrum. Undoubtedly one of the 
chief reasons for the wide use of sound level meters rather than the aural 
comparison method lies in the fact that the meter readings are repeatable 
with a high degree of reliability. Furthermore, it is difficult if not 
impossible for an observer to match sounds varying greatly in intensity 
from moment to moment. The aural comparison method does show dif- 
ferences in loudness when the same observer makes repeated observa- 
tions of the same sound. This is especially true of inexperienced ob- 
servers and leads investigators after a brief trial of the method to dis- 
card it. On the other hand the work of Geiger and Firestone (21), Ham 
and Parkinson (23), and Fletcher and Munson (17) showed that prac- 
ticed observers following carefully a standard procedure are remark- 
ably reliable in their estimations of loudness of relatively steady tones. 

It is possible to compute* the loudness of a complex noise by the use 
of a wave analyzer which records the magnitude of sound energy in 
each of several wave-frequency bands. The noise spectrum thus de- 
termined provides the basic data for computing the loudness using 
formulae developed by Fletcher and Munson (17). The method is 
cumbersome and yields results only slightly different than the mean 
value of several observers using a properly calibrated test sound. 

It becomes clear that both methods of measuring noise levels require 
more than an ability to set up a microphone, flick switches and 
read dials. The meter readings must be interpreted in the light of the 
nature of the sound spectrum and the known tolerances of the meter 
itself. On the other hand the aural comparisons must likewise be con- 
ducted and interpreted by one familiar with the standard manner of 
listening and the known sources and magnitude of errors. In the hands 
of competent investigators neither method will give a measure of 
high precision, but either will provide a basis for an informed estimate of 
loudness. 

Effects on Production 

One may find in textbooks references to a number of reports indicat- 
ing that production is markedly increased with a reduction in noise. 
Several of these emanate from an anonymous article (62) in which 
references to the original reports are lacking. It is not possible, therefore, 
to evaluate the claims that “lowering the noise level in a telephone-exchange 
room from 50 db. to 35 db. resulted in a forty-two per 
cent reduction in errors and a three per cent reduction in cost per message” 
(4). Or that “moving the assembly department of a temperature-regulator 
company from next to the boiler room to a quieter location 
reduced faulty assemblies from seventy-five per cent to seven per cent 
of those turned in for inspection” (4). 

Only one field study in this country has been completed with sufficient 
control and covering a sufficient time to place confidence in the 
results. This study has not been formally published in detail although 
it has been referred to frequently (32, 59). The Aetna Life Insurance 
Company in December 1928 installed sound absorbing materials in an 
office occupied by typists, clerical checkers, punchcard and comptometer 
operators. Bonus figures were available reflecting the efficiency of employees 
for a year prior to installation. This “efficiency” refers to the 
allowed time for a given amount of work as determined by time studies, 
divided by the actual time taken. The average difference in the bonus 
based on efficiency of fifteen clerks taken at semi-monthly intervals for 
the year prior to, and a year after acoustical treatment was 9.2% in 
favor of the quieter condition. At no time during the “quiet” year did 
the bonus go below the level of the first year. No other changes were 
introduced into the working conditions during these two years. The 
sound level as measured by a Western Electric 3-A audiometer using 
the loudness matching method was 35 “sensation units”* after acoustical 
treatment. 
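
The efficiency measure described above is a simple ratio, and the comparison between years is an average of paired differences. The following Python sketch shows the arithmetic under stated assumptions: the function names are hypothetical, and the times used are invented for illustration and are not the Aetna figures, which have not been published in detail.

```python
def efficiency(allowed_time, actual_time):
    """Allowed time for a job (from time studies) divided by actual time taken."""
    if actual_time <= 0:
        raise ValueError("actual time must be positive")
    return allowed_time / actual_time


def mean_paired_difference(before, after):
    """Average of (after - before) over paired observations."""
    if len(before) != len(after) or not before:
        raise ValueError("need two non-empty series of equal length")
    return sum(a - b for b, a in zip(before, after)) / len(before)


# Hypothetical figures for illustration only; they are not the Aetna data.
noisy_year = [efficiency(8.0, 8.0), efficiency(8.0, 7.5)]
quiet_year = [efficiency(8.0, 7.5), efficiency(8.0, 7.0)]
print(mean_paired_difference(noisy_year, quiet_year))
```

An efficiency above 1.0 means the work was finished in less than the allowed time. Whether the 9.2% reported in the study is a difference of such ratios or of the bonus payments derived from them cannot be determined from the available accounts.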

A control check on these findings was made after one year by cover- 
ing the sound absorbing materials with gypsum board. The sound level 
was then 41 “sensation units.” The bonus efficiency immediately 
dropped to an intermediate value between the two previous years, but 
within two months was approximately equal to the level of the “quiet” 
year. At the time the bonus reached this high level in the presence of 
the greater noise there was a reduction in work, only the more efficient 
employees being retained from then on, which probably accounted for 
the high average efficiency. Comparable, but less clear-cut results were 
noted among the punchcard, and comptometer operators. In these latter 
groups machine breakdowns and other uncontrolled factors compli- 
cated an interpretation of the data. 

The available data provides no basis for the calculation of statistical 
tests of reliability of the differences noted. In spite of this fact it is 
likely that the advantage noted in favor of the quieter condition is not 
a matter of chance. 

The influence of noise on school children and grade school work was 
investigated by Obata (39). The noises were a variety of phonograph 
records supplemented by a mechanical rattler varying between 60 and 
85 db. With each type of sound working speed was more affected than 
accuracy in arithmetic and symbol cancellation. The adverse effects 
were not solely related to intensity since certain soft records were as distracting 
as the rattler. 

* This is a unit roughly equal to a decibel and used prior to 1936. 

At the Waverly Press, Baltimore, a number of changes were made in 
working conditions in 1935, among them being an acoustical installation.
It was reported that “The reduction in noise and the use of rest
periods are found to be useful means of producing not only higher effi- 
ciency from the standpoint of management but also a much more agree- 
able situation for workers themselves” (48). 

In an attempt to duplicate the industrial production situation, Laird
(29) devised an “experimental factory” in which the subjects were required
to touch a metal plate with an electrical stylus through 10,500
small holes per work spell as they appeared in a moving tape. The holes 
were irregularly spaced and of irregular size appearing in a window 3J 
by inches directly in front of the worker. Production of four subjects, 
throughout each of the 4½-hour work spells, was recorded by electrical
counters. The results presented in the form of the number of holes 
missed per working spell per worker showed almost a straight line 
drop from 400 holes missed at 90 db. to 300 missed at 40 db. when the 
complex noise delivered through an amplifier by the Western Electric 
3-A audiometer was presented. Using pure tones at loudness levels equal 
to a 512 cycle tone at 60 db. on the Western Electric 2-A audiometer, 
errors increased slightly as the frequency was stepped up from 64 to 
512 cycles. Beyond this frequency, errors increased more rapidly from 
270 at 512 cycles to 325 at 4096 cycles. It is to be noted that the complex 
noise especially rich in high frequencies at comparable loudness levels 
affected production more adversely than the pure tones. 
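
The "almost a straight line" relation reported for the complex noise can be made concrete with a short sketch. It assumes a strictly linear relation through the two end points quoted above (300 holes missed at 40 db. and 400 at 90 db.); the function names are illustrative and are not taken from Laird's report.

```python
# Minimal sketch: linear interpolation between the two end points quoted
# in the text. Strict linearity is an assumption; the source says only
# "almost a straight line".

LOW_DB, LOW_MISSED = 40.0, 300.0     # quieter end point from the text
HIGH_DB, HIGH_MISSED = 90.0, 400.0   # noisier end point from the text
HOLES_PER_SPELL = 10_500             # holes presented per work spell


def holes_missed(level_db: float) -> float:
    """Estimated holes missed per work spell at a given noise level."""
    slope = (HIGH_MISSED - LOW_MISSED) / (HIGH_DB - LOW_DB)  # 2 holes per db.
    return LOW_MISSED + slope * (level_db - LOW_DB)


def miss_rate(level_db: float) -> float:
    """Estimated fraction of the holes in a spell that were missed."""
    return holes_missed(level_db) / HOLES_PER_SPELL


if __name__ == "__main__":
    for db in (40, 65, 90):
        print(db, holes_missed(db), round(100 * miss_rate(db), 2))
```

On these figures each added decibel corresponds to about two more holes missed, and even at 90 db. fewer than four holes in a hundred were missed.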

In evaluating this study one wonders to what degree the increase in 
errors both with increases in frequency and loudness might be due to 
progressive dulling of motivation. In spite of efforts to keep motivation 
constant, the high error counts occurred after several days' work with
high level and high frequencies. On the other hand, one might have ex- 
pected a gradual adaptation to the noise such as Harmon (24) found, 
in which case errors would have been least numerous toward the end. 
It is regrettable, though understandable in view of the lengthy demands 
on the subjects, that the study was not repeated using the noise in- 
tensities and frequencies in reverse order. 

Pollock and Bartlett (44) conducted a series of experiments using
loud clicks and a complex noise delivered to head phones worn by the
subjects while engaged in a simple eye-hand coordination task of picking
up and replacing pegs in a moving peg board apparatus. The results
indicated that the noise adversely affected performance initially but
adaptation set in rather quickly, especially when the task was of such a
nature as to permit thorough automatization. Considering long work
spells of eight hours, the difference in performance between noise and
quiet was insignificant. In the case of mental work (word-making),
noise, especially discontinuous complex tones, had an adverse effect on
speed which was most marked early and late in a working spell of thirty
minutes. Again, however, a plan of work rendering performance more
automatic tended to counteract adverse effects. No data are presented
on the loudness or frequency of the noises used, but performance appeared
most affected by the arhythmical character and interestingness of the
distractions.

Weston and Adams (57) in England examined the output of weavers
over a period of twenty-six weeks during which the workers wore ear
defenders on alternate weeks, thus reducing noise from 96 db. to 87 db.
as measured by the Barkhausen audiometer (comparable to the Western
Electric 3-A, although the reference level for the db. scale is not
available). Output was one per cent greater while wearing the ear defenders.
However, output was largely controlled by loom speeds. The operations
by the weavers occupied about five minutes per hour without ear
defenders, but with ear defenders were reduced to four and one-half
minutes, a speed increase of about twelve per cent. There was some
evidence based on hourly records that even after years of work in a
noisy environment, the worker does not become completely adapted to
noise but goes through the adaptation process daily.
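
The distinction between time saved and speed gained in the weavers' own operations can be checked with a brief sketch. It uses only the rounded figures quoted above (five minutes and four and one-half minutes per hour); the helper name is illustrative, and the unrounded data behind the reported "about twelve per cent" are not available here.

```python
# Minimal sketch using the rounded figures quoted in the text. The exact
# times underlying the reported "about twelve per cent" are not given, so
# the result is only an approximation of that figure.

def percent_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before


MINUTES_WITHOUT = 5.0   # weaver's operations per hour, no ear defenders
MINUTES_WITH = 4.5      # the same operations with ear defenders

# Reduction in the time the operations occupy.
time_saved_pct = -percent_change(MINUTES_WITHOUT, MINUTES_WITH)

# Increase in speed, i.e. operations completed per minute of handling time.
speed_gain_pct = percent_change(1.0 / MINUTES_WITHOUT, 1.0 / MINUTES_WITH)

if __name__ == "__main__":
    print(f"time saved: {time_saved_pct:.1f} per cent")
    print(f"speed gain: {speed_gain_pct:.1f} per cent")
```

With the rounded times the saving is ten per cent of the handling time and the speed gain is about eleven per cent, close to the figure reported; the small difference presumably reflects rounding of the quoted times.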

Since the weather in the above study favored work on weeks when 
ear defenders were worn, the same investigators returned to the same
weaving shed to repeat their observations over a twelve-month period 
using two groups of 10 weavers each equated by supervisor’s ratings. 
They discovered a three and one-half per cent advantage in output and 
a thirteen and one-half per cent improvement in personal efficiency in 
the group wearing ear defenders which reduced the noise from 96 to 
81 db. 

The significance of these experiments lies in the apparent demonstra- 
tion that workers long accustomed to noise, and by no means disposed 
to complain of it, may show substantial increases in working efficiency 
as a result of noise reduction. Such a finding is obviously of economic 
importance in the manufacturing field. The investigation, therefore, 
merits careful critical consideration. 

The second experiment is open to the serious objection that the exact 
equality in production of the experimental and control groups was never 
demonstrated. The unreliability of the rating method might have produced
a chance superiority for the experimental groups. It is regrettable
that the investigators did not interchange the experimental and control 
groups. 

The influence of suggestion was not controlled. How important a 
factor suggestion may be can only be surmised. Baker (3) demonstrated 
with college students working for 5-minute periods under noise and
quiet conditions that output can be lowered or raised depending upon 
subtle suggestions incorporated in the pre-experimental instructions. 
We have no way of knowing what suggestions were given workers in the 
Weston and Adams studies. However, since neither the work, nor the 
length of the work spells in Baker's study are comparable to the industrial
situation, there is no reason to expect suggestion would be a
major factor influencing personal efficiency. 

Finally, the variability of percentage improvement in personal 
efficiency from month to month ranging from 31.42% to 3.42% sug- 
gests powerful uncontrolled factors influencing the differences between 
the groups. 

There remains, none the less, the fact of a consistent superiority of 
output under sound reduction. The authors' conclusions that "excessive
noise is a factor having an important effect upon industrial efficiency"
and that "tolerance is not acquired with lapse of time" are not irrefutably
established, but the weight of evidence is in their favor.

A study by Kornhauser (28) attempted to relate the output of
typists to noise in a regular office situation. The results were equivocal 
and complicated by the existence of other distractors in the quieter 
location. 

Using only two subjects, presumably the authors, Vernon and
Warner (54) reported that performance on an arithmetic test, a purely
routine assembly job, and reading difficult material were not signifi- 
cantly affected by noise after a short adjustment period during which 
there were adverse effects. On the other hand judgments of the magni- 
tude of subjective disturbance indicated less rapid adjustment to the 
noise. These conclusions were based not only upon observations under 
laboratory conditions but included tests made in the presence of factory 
and office noises. 

In reviewing these reports dealing with production and accuracy of 
work one is impressed with the methodological difficulties of conducting
meaningful investigations "in the field." Other conditions beyond experimental
control, such as temperature, machine breakdowns, or changes in
quality of raw materials may obscure whatever benefits noise reduction 
may contribute. It seems clear that a thorough study of industrial production
as contrasted with office work has yet to be done under positive
control of all factors, covering work periods of sufficient length with 
noise typical of industrial situations. Furthermore, the factor of sug- 
gestion in the studies so far published has received little attention pos- 
sibly because it is difficult to control. 

Influences on Vital Processes 

The effect of noise on blood pressure, respiration rate, metabolism, 
muscular tension, electrical skin resistance, digestive processes and 
similar functions has attracted some experimental investigation. Among 
the most widely quoted of these is the report by Laird (29) who meas- 
ured the metabolic rate of four typists during a half hour of resting and 
while typing a standard letter over and over again for two hours per 
day throughout a four-week period. During the first and last of these
weeks the walls of the test room were bare. During the second and
third weeks sound absorbing material was applied to walls and ceiling,
reducing the noise produced mechanically from 50 “sensation units” to
40 “sensation units.”

On the average the metabolic rate was fifty-one per cent higher while 
working compared with the resting rate during the quieter weeks, and 
was seventy-one per cent higher during the noisy weeks. A suggestion 
that the noisy phase was more fatiguing is found in the data showing 
the average time for the last five letters of the two-hour spell was seven 
seconds less than for the first five letters in the quiet phase, while the 
comparable time was five seconds more in the noisier condition. How- 
ever, the fast typists improved in speed when the noise was reduced, 
while the slow typists showed little or no change in overall speed. 
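
The two percentages above can be compared directly with a small sketch. It assumes the same resting metabolic rate in the quiet and noisy weeks, which the text does not state, and uses an arbitrary resting value of 100 units; the function name is illustrative.

```python
# Minimal sketch. Assumption (not stated in the source): the resting
# metabolic rate was the same in the quiet and the noisy weeks. The
# resting value of 100 units is arbitrary and cancels out of the ratio.

def working_rate(resting: float, pct_above_resting: float) -> float:
    """Working metabolic rate given the percentage rise over resting."""
    return resting * (1.0 + pct_above_resting / 100.0)


RESTING = 100.0
quiet_rate = working_rate(RESTING, 51.0)   # quieter weeks, from the text
noisy_rate = working_rate(RESTING, 71.0)   # noisy weeks, from the text

# Extra working expenditure in noise, relative to working in quiet.
extra_pct = 100.0 * (noisy_rate - quiet_rate) / quiet_rate

if __name__ == "__main__":
    print(f"working rate in noise exceeds quiet by {extra_pct:.1f} per cent")
```

Under that assumption, typing in the noisier room cost roughly thirteen per cent more energy than typing in the treated room, a smaller figure than the twenty-point gap between the two percentages might suggest at first reading.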

No day-to-day data were published by Laird, a fact that has been 
criticized by Harmon (24), who found a very rapid adjustment to noise, 
using metabolic rate as the principal indicating function and mental 
arithmetic as the work. Such a task has a much lower muscular component
than typing and could not be expected to affect the metabolic
rate to a marked degree. It is important to note that on the initial days
of noise Harmon found a thirty-seven per cent increase in metabolic
rate. During the same days the average number of errors per day was 
12 for quiet and 24.3 when working under noise. 

Prior to Laird's study Morgan (38) found evidence of similar extra 
effort in the presence of noise. His subjects engaged in a form of trans- 
lating letters into numbers by pressing appropriate keys. Respiration 
was recorded as well as the pressure exerted upon the keys. The noise 
came from a variety of buzzers, bells and phonograph records. Noise and 
quiet periods alternated during the experimental sessions, the ten-minute
noise period following thirty minutes of quiet. Noise was responsible
for greater key pressure being exerted and greater articulation
on the part of the subjects — a device spontaneously adopted in an ap- 
parent attempt to overcome the distracting effects of the noise. The 
time per response increased significantly at the beginning of each noise 
period but subsequently decreased to a point where output was more 
rapid than during quiet. 

Vernon and Warner (54), Freeman (19), and Ford (18) generally
confirm the findings of Morgan and Laird so far as an increase in energy 
expenditure in the presence of noise is concerned. 

Davis (9) measured muscle tension in the presence of noise by means 
of action potentials. His eighteen subjects were required to do nothing 
except to sit and listen to the noise presented in a series of two-minute 
intervals separated by two minutes of quiet. Marked increases in ten- 
sion appeared on the first experimental day at the onset of each noise 
period. On subsequent days the change became less and less marked 
so that by the fifth day tension was virtually unchanged when noise 
was presented. 

The effect of noise on digestive processes was examined by Smith 
and Laird (49) whose four subjects swallowed a rubber balloon which was 
later inflated in the stomach. Peristaltic contractions were thus recorded
by an appropriate pneumatic system operating a writing point on a 
kymograph. Noise of 80 db. for ten minutes following a twenty minute 
quiet period produced a thirty-seven per cent decrease in the number of 
contractions. The recovery period following the noise showed no con- 
sistent pattern or change in stomach action. The effects using a 60 db. 
noise were less marked but of the same nature. The rate at which saliva 
and gastric juices flowed was also decreased in the presence of noise. 
These changes would presumably have an adverse effect upon the total 
digestive process. 

Although details of the experiment have not been published in full, 
Luckiesh (33) reported that a “demonstration visual test” was performed
six per cent more quickly in a quiet room than in the presence
of the hum from a motor-generator set. 

Other observations in the presence of noise of high intensities and 
short duration have demonstrated increased blood pressure (35) and increased
cranial pressure (26). In the latter case, however, one wonders
to what degree the sound pressure itself may have affected the recording 
apparatus. 

Davis (10) reported a marked decrease in skin-resistance at the onset 
of a five-minute noise period. However, the function gradually returned
to normal and reached a value before the end of the noise period ap- 
proximating the value obtained with a control group sitting quietly for 
ten minutes. The noise level is not reported. 

Even though no experimental studies have been reported dealing 
with effect of noise on emotional control or emotional stability, the re- 
ports above indicate physiological changes of the kind associated with 
emotional disturbances. Many of the changes noted in response to 
noise are similar to, if not identical with those produced by stimuli 
exciting fear, anger, and general emotional tension. Throughout much 
of the published literature (15) are testimonials that cannot be dismissed
lightly, indicating that high noise levels are the cause of “nervous
exhaustion,” emotional instability, and related conditions.

Adaptation to Noise 

In some of the studies previously cited, notably those by Harmon 
(24), Morgan (38), Ford (18), Davis (10), Pollock and Bartlett (44), 
emphasis has been given to the process of adaptation or adjustment to 
noise. Morgan found an initial decrement in output followed in some 
instances by greater-than-normal production. Harmon reported an initial
increase in metabolic rate followed by smaller and smaller increments
on succeeding days at the onset of noise. Pollock and Bartlett’s
subjects reported they could hardly bear the noise in the early experimental
sessions, yet at the end they did not notice the noise. These reports
do not fit with the long-term studies by Laird, the Aetna Life Insurance
Company, and Weston and Adams, which show a net deleterious
effect of noise. It is possible that the differences in results can be at- 
tributed, in part, to the factor of suggestion as already mentioned in 
the previous section. On the other hand it is not reasonable to expect 
that a higher-than-average efficiency index could be maintained without 
a break over a full year of operation in a group of fifteen clerks by sug- 
gestion alone, as the report of the Aetna study shows. It is also true 
that the studies showing an adjustment to noise such that the detri- 
mental effects are reduced or eliminated cover, for the most part, short 
noise periods. We may hazard the guess that if the noise were con- 
tinued over longer periods comparable to those experienced in industry 
the adjustment mechanism compensating quickly for the initial ill 
effects may break down and the net effect of noise even in later studies 
might have been adverse. In line with this hypothesis is Freeman’s (19) 
idea that the added effort noted in the presence of the initial noise is 
not just a rise in total energy expenditure but comes from a shift in the 
pattern of supporting processes not directly concerned with the principal
functions (typing movement, etc.) under observation. With continued
exposure to the noise this irradiation of excess tension becomes
less prominent and a more economical concentration of supporting 
processes reasserts itself. 

Factors Affecting the Annoyance of Noise 

Throughout the studies mentioned above are frequent references to 
the erratic manner in which noise acts as a distractor. Only a few studies 
have been reported which were aimed at discovering the features of the 
noise which contributed to annoyance. Using a paired-comparison 
method with pure tones at 50 db., Laird and Coye (31) found the an- 
noyance of tones to be ranked in the following descending order: 8192, 
4096, 64, 2048; with 128, 256, and 1024 of minimum and equal an- 
noyance. Thus, other things being equal, high tones and extremely low 
tones are judged more annoying than those in the middle range. The
effect of intensity was examined by matching in annoyance a 256 cycle 
tone at various intensities with other pure tones adjustable in loudness. 
The equal annoyance contours thus established below 500 cycles followed
roughly the equal loudness contours shown in Figure 1. Above that
frequency the equal annoyance contours tended to drop away from the 
equal loudness contours, indicating that low tones of high intensity are 
as annoying as high tones of relatively low intensity. 

The observations of acoustical engineers (40) testify to the im- 
portance of high frequencies as they affect the reported comfort of 
treated areas. In one case two adjacent offices were acoustically treated, 
but with different materials both having the same noise level and similar 
sound sources. The absorption coefficient at 4096 cycles for Office A 
was 0.45 and for Office B was 0.82. B was rated by its occupants as 
comfortable while those working in A complained that the typewriter 
noise was not properly quieted. Replacing the lower coefficient material 
by the other type corrected the complaint, in spite of the fact that the 
overall Noise Reduction Coefficients for the two materials were ap- 
proximately the same. 

Interrupted noise or discontinuous tones have been generally found 
to be more annoying than steady noises. This was true in the reports by 
Pollock and Bartlett (44), and Laird (24).

Furthermore, Finch and Culler (13) found greater auditory destruction
in dogs subjected to high intensities when the tones were interrupted.
Even the reports of Morgan (38), Davis (10), Freeman (19),
Harmon (24), and others who have used relatively short noise exposure
periods would lead one to expect that discontinuous noise would be the 
more detrimental. In all of these reports the initial onset of noise pro- 
duced the greatest adverse effect. Hence the discontinuous noise would 
be somewhat comparable to a rapid series of noise periods each of which 
demands a new adjustment with initial phases of maximum cost to the 
individual. 

A number of writers have referred to a miscellaneous list of factors 
that influence the annoyance of noise, none of which have been subjected 
to systematic investigation. Among these factors are the unexpectedness
of noise (42), the spreading effect and reverberation (47), and the degree
to which the noise is unnecessary or indicates malfunctioning of equipment
(2).

Of primary interest to psychologists are the individual differences 
in noise tolerance. Numerous independent observers in the field (47) 
as well as nearly all laboratory investigations agree in reporting wide 
differences in the degree to which persons are affected by the same 
noises. Related to this factor is the wide variation in response by a 
given person on different occasions. Shifts of attitude, motivation, and 
attention result in widely different types of performance under the 
same external noise conditions. No systematic attack has been made 
on this problem to discover what personality features or other psychological
factors are correlated with noise tolerance.

Stimulation Deafness 

Kemp (25) has recently reviewed the studies dealing with stimula- 
tion deafness. It is, therefore, unnecessary to review that area except to 
reemphasize the necessity of interpreting the reported incidence of 
alleged occupational deafness in the light of the normal decrement in 
acuity with age. The most complete information dealing with incidence 
of hearing losses in the general population is that gathered by the Bell 
Telephone Laboratories at the New York and San Francisco Fairs in 
1939 (50). In spite of this precaution which prevents the acceptance of 
many reports at their face value, there is little doubt that extended ex- 
posure to intense noise will produce both a temporary and permanent 
loss in auditory acuity. 

Much of the experimental work in this area has been done on animals
in an effort to throw light on the mechanism of hearing. The results
have shown in general that the frequencies suffering the greatest
impairment are not correlated with the predominant exposure frequencies,
although the degree of impairment is related to the duration
and intensity of the exposure noises (8, 43, 46, 55).

Although numerous reports are available purporting to show that 
workers in high noise levels have a greater-than-average incidence of 
hearing loss (22, 34, 35, 36, 37, 40, 45, 52, 53), the devices for measuring 
hearing have varied widely making any comparison between occupa- 
tions hazardous. One survey (61) has been published of nine plants in 
New York State conducted in a standard manner using a Western Elec- 
tric 3-A audiometer. The report concludes that the greatest percentage 
of deaf people occurs in plants of greatest noise. However, it must be 
pointed out that not all employees in the plants of greatest noise were 
tested. It is possible that a selective factor operated to exaggerate the
proportion of deaf people tested in those plants. 

There appears to be no reasonable doubt that high noise levels can
and do produce hearing defects especially after long exposure. Still un- 
answered are the questions, what levels can be tolerated, and for how 
long without incurring a defect? 

Conclusion 

In spite of a widespread interest in noise abatement, relatively few
facts have been well established. Popular literature not covered in this 
review abounds in emotional outbursts against the baneful effects of noise. 
Public support has been enlisted for noise abatement campaigns on the 
uncritical acceptance of the assumption that noise because it is an- 
noying must be harmful. The available scientifically controlled studies 
are not in complete agreement but tend to show ill effects in output,
speed of work or vital processes. Although a considerable degree of 
adaptation takes place, the evidence suggests that it is seldom complete. 
Marked individual differences in susceptibility to the ill effects of noise 
have been noted but no reported attempts have been made to correlate 
these differences with other facets of personality. The factors determining
annoyance have not been subjected to thorough analysis. Stimulation
deafness is an unquestioned result of exposure to loud noises for
long periods but its extent and the critical levels of noise necessary to 
produce it in humans have not been clearly established. In summary, 
it is clear that there are many circumstances under which noise detracts 
from efficiency and well-being. Under what circumstances noise is 
deleterious and for what kinds of people are questions for further fruitful 
research. 
STUDIES IN TIME PERCEPTION


A. R. GILLILAND, JERRY HOFELD and GORDON ECKSTRAND 
Northwestern University 

The study of time takes various forms and presents problems to 
investigators in many fields. For example, the nature of time is of spe- 
cial interest to the philosopher. The accurate measurement of time is 
important for the physicist and the astronomer. In a somewhat related 
sense, time is a very practical problem for the daily worker who punches 
a time clock, the athlete who competes in games and the aviator in his 
maneuvers. The perception of time is a subject of special interest to the 
psychologist. 

In this paper we shall be interested in time perception and time esti- 
mation and shall be concerned with such problems as the nature of time 
only in so far as it is necessary for a definition of our problem. In this 
quest we will not be greatly aided by such definitions of time as that 
proposed by Kant, who said that time is “a universal form of intellect,”
or by Leibnitz, who described it as “the obscure and confused picture
of the grounds which determine the order of succession.”

The physicist’s definition of time, as the measurement of the move- 
ment of the earth through space, is more helpful but it is not entirely 
satisfactory for our purposes. Neither is the somewhat naive opera- 
tional definition of the psychologist who says that time is what the 
clock measures, much more helpful. This is true for the double reason 
that time existed and no doubt was perceived before the advent of clocks 
and secondly, because when we ask what it is that the clock measures 
we are back to our original question. 

Such considerations in the absence of any more constructive material 
lead us to the tentative description of time as our perception of one of
the two essential characteristics of movement. One of the essential 
characteristics is space and the other is time. This brings out the close 
relationship between time and space. Such an explanation smacks 
strongly of the theory of relativity and fits nicely with Gestalt theory 
in psychology. However, this need not necessarily constitute a criticism 
of the definition. Neither does it inevitably commit us to a purely 
Gestalt explanation of time since such time-space relations may be 
assumed to be the result of our experiences rather than nativistically 
given. 

If we wish a more specific definition of time, it may be described as
the perception, or probably better stated as the judgment, of the amount
of separation of two units or limits as greater or smaller, in terms of
some unit such as seconds. This judgment is reached from an estimation
of how long it would take to accomplish so much or so many things
such as to walk a block, listen to so much or to feel rested or fatigued 
by so much work. 

While many theories have been proposed and many experiments conducted
on the subject of time perception or time estimation,* as yet
there is no generally accepted view as to how we perceive or estimate
time. Studies in time estimation might well be classified under several
related but somewhat different problems, such as:

1. The estimation of relatively short time intervals. Most of the experi- 
mental work in time has dealt with the estimation of time between two signals 
up to a few minutes in length. In most cases the estimator has been forewarned 
that an estimation was to be made. 

2. The estimation of time when the subject is not expecting to make such 
estimation. For example, if the reader was suddenly asked to estimate how 
long had been spent in reading this paper up to this point, we do not know 
whether the same methods of estimation would be used as in case he had ex- 
pected to make this estimation. 

3. It is also not certain whether the same methods are used in estimating
and in reproducing time intervals. In estimating time a unit of estimate such 
as a second is necessary. In reproducing time, a unit is at least not so necessary. 

4. The perception of longer intervals involving hours or days may be similar 
or may be different from estimating shorter periods of time. Certainly some of 
the same factors are involved but some new ones may also be involved. 

5. There also is the question of how lower animals perceive time. Do they 
use the same methods as humans? At least there certainly is little “mental 
content” used in their estimations. 

6. Then there is the problem of estimating the time of day or night. This 
at least in modern society involves direct reference to some instrument of time 
measurement such as the clock. In primitive society the sun or moon was, no 
doubt, generally used as the reference point. That our modern methods are only 
a refinement of this method is indicated by our usual methods of indicating 
time by so many hours, minutes and seconds since noon or midnight. 

In the following sections of this paper we shall review the principal 
recent studies that have dealt with the problems listed above. Since
research has not proceeded as logically as the outline, this review will
follow the various lines of research.

Since there have been excellent reviews of the literature on time estimation
in a series of articles by Dunlap (13) from 1911 to 1916 and a
later review by Weber (56) in 1933, this article will in general be limited
to the literature since the latter date. There will, however, be occasional
violations of this principle (1) in order to emphasize some point of view
developed during the earlier period and (2) in order to summarize studies
in time perception in animals. This latter topic was not included in the
earlier reviews and it may well have an important bearing on research
in time perception.

* The phrases perception of time and estimation of time are here used somewhat
interchangeably. In a sense they are not strictly synonymous. Time estimation always
implies a kind of quantification not necessarily included in time perception.


Cues for Time Perception 

One of the important problems in time perception is that of the cues 
used. In fact, most of the experimental studies in time perception can be 
related more or less directly to this problem. In the closely allied field 
of space perception the principal cues as well as their relative importance,
as Carr (8) has so well pointed out, have been very carefully
investigated. Although many studies have been conducted, no such 
list of cues has been discovered for time perception. In fact, almost the 
opposite is true. That is, we are hardly sure of any of the cues that are 
used in time perception. 

Several of the earlier studies reviewed by Weber (56) investigated the 
relationship between certain physiological processes such as breathing, 
fatigue or digestive processes and time estimation. Some of these studies 
found evidence for such relationships but the evidence was never con- 
clusive. 

In some recent studies in this field Rosenzweig and Koht (45) com- 
pared time estimation and what they called “need tension.” They 
defined need tension as a state of strain. They found that during periods 
of greater need tension there was more underestimation than when the
tension was not so great. 

A somewhat similar point of view was presented earlier by Francois 
(17, 18) and later by Hoagland (30), who believed that time is mediated
by internal temperature fluctuations. For example, Francois found that 
when the internal temperature was increased by high frequency cur- 
rents, the tapping rate of the subject was increased. From such kinds 
of studies these two writers believe we have some kind of internal time 
clock for the perception of time. 

Kawasima (37) had subjects estimate time while moving their arms 
in ten degree circles. Overestimation occurred when the action was 
easy and underestimation when it was difficult. 

Shaefer and Gilliland (48) systematically varied (1) pulse rate (2) 
heart work (3) blood pressure (4) breathing rate and (5) lung work, in 
order to determine the effects of each on the estimation of time. No 
relationships were found between any of these variables and time esti- 
mation. Even when all these changes occurred together producing a 
state of great physiological activity they seemed to have no influence 
on the direction of error or accuracy of time estimation. 

Berman (5) investigated the relation of time estimation to what he 
called satiation. A stylus maze was used to produce satiation. When the 
subjects were satiated, 87 per cent of his subjects underestimated the 
time required for them to become satiated while 52 per cent of the non- 
satiated group overestimated the time required to reach their criterion 
of learning. 

Jasper and Shogass (34) found that the alpha rhythm of the brain 
was in no way related to conscious estimates of time. 

Certain drugs and disease affect the estimation of time. It is common 
knowledge that the drunk man is generally disoriented as to time as 
well as space. 

Favilli (15) states that under mescal intoxication time estimation is 
very poor. Forty-five minutes was estimated as 15. But sometimes the 
estimates are abnormally short and sometimes abnormally long. Favilli 
explains this by suggesting that in the intoxicated state the shortening 
of the “field of consciousness” results in a sort of contemplation of the 
present instant which lacks a frame of reference. This seems to be a way 
of saying that the subject is always living for the present and lacks the 
usual perspectives for estimating time. If this be true, mescal intoxica- 
tion may not have a greatly different effect upon time estimation than 
alcoholic intoxication. Marijuana also seems to produce a somewhat 
similar effect. It is sometimes smoked in cigarettes by orchestra leaders 
and drummers with the idea that it assists them in rapid timing. 
Whether it does or not seems not to have been investigated experi- 
mentally. 

Sterzinger (49, 50) gave subjects several different drugs for the purpose 
of determining their effects on time estimation. After taking 
quinine subjects habitually underestimated 5 minute intervals. Alcohol 
produced too low an estimate for periods less than 25 minutes and too 
high for longer periods. Caffeine and thyroxin gave inconsistent results. 
Sterzinger came to the conclusion that time estimation depended upon 
metabolic rate. 


Filled and Unfilled Time 

One of the early topics to receive experimental study in the field of 
time estimation was the effect of filled and unfilled time. That is, what 
effect does the type or amount of mental content or physical activity 
have upon its apparent length? Although the literature of this topic is 
extensive the results, as will be seen from this survey, are by no means 
consistent. Some results seem to indicate as William James (35) suggested 
that filled time while passing is perceived as longer than it really 
is. On the other hand many studies have obtained the opposite results. 
One explanation for such discrepancies is the fact that it is difficult to 
determine whether a time interval is filled or empty. The interval may 
be filled with a large amount of visual, auditory, or other type of ex- 
ternal change. However, these may have little or no attention value for 
the subject who is estimating the time. On the other hand, the time in- 
terval may be relatively quiet so far as external stimuli are concerned 
and still be richly filled with mental content for the subject. For ex- 
ample, he may be thinking, “now one second, now two seconds, three 
seconds, etc.” Many of the seeming inconsistencies in the results for 
different investigators are due to factors such as those just enumerated. 

Much of the early experimental work on filled versus unfilled time 
was summarized by Triplett in 1931 (52). 

Later studies by Helm (29) considered subjective factors such as 
attention as important in time estimation. Harton (26, 27, 28) found 
that activities of varying difficulty influenced time estimation. Time 
spent in making difficult discriminations was estimated as less than 
equal periods spent in making easy discriminations. In working at tasks 
in which the subjects were encouraged and felt they were being success- 
ful, the time was estimated as shorter than when they were discouraged 
and felt they were being unsuccessful. 

There could hardly be more inconsistency in results than those found 
in the various studies in estimating filled and unfilled time. Does filled 
time appear longer or shorter than unfilled time? Does William James’ 
dictum that filled time seems shorter in passing and longer in retrospect 
hold true? As already suggested, no single answer can be given to this 
problem except that time estimation depends upon other factors than 
whether the time is filled or unfilled. In so far as filled time may more 
often appear longer than unfilled time, it is largely a matter of the sub- 
jective filling of the time. 

Recent studies by Abbe (1, 2) investigated the effect of increasing the 
amount of physical space upon time perception. Two blinking lights 
were used to delimit the time interval. The distance between these lights 
was varied but the time between their flashes was left constant. Abbe 
reports that time intervals were reported as longer when the space be- 
tween the lights was greater. In a related experiment he also found the 
converse to be true, that is, if space intervals were given in a longer 
interval of time the space between the lights was judged as longer than 
when given in a shorter time. Tashiro (51) in a similar experiment in 
which he could arbitrarily vary both the rate of tapping and the space 
between which the taps were made, concludes that the rate of tapping 
built up a gestalt with the result that the meaning of each tap was lost 
irrespective of the tapping speed. However, the tempo of the tapping 
which was determined solely by the spatial distance between the points 
struck in the taps was more powerful in producing the effect of gestalt 
movement than the rate of tapping. Rudert-Kotte (46) holds that 
binocular parallax, so important in space perception, may give clues to 
time perception through the space concept as well as through the slight 
time variants due to the fact that we see through two eyes instead of one. 

While such studies as those just enumerated do not prove conclu- 
sively any theory of time perception, they give us material which must 
be taken into consideration in the formulation of any theory and they 
strongly indicate that there is a close relationship between the amount 
of mental content and the estimation of time. Space is directly related 
to time in that when space is extended this gives the impression of 
greater mental content. Yet this no doubt is entirely too simple an ex- 
planation of time and will need further elaboration as we proceed. 

Individual Differences in Time Perception 

Three problems relating to time perception will be discussed in this 
section. One of these is the determination of the indifference point in 
time estimation. Another deals with the Weber-Fechner law and the 
other with sex differences in time estimation. 

Most of the work on the indifference point in time estimation oc- 
curred before 1933 and was reviewed in Weber’s (56) article. The 
results may be summarized by saying that different investigators have 
obtained very different results. 

Similar inconsistencies have been obtained in determining whether 
Weber’s law holds in time estimation. Some evidence indicated that 
errors are proportional to the length of the interval and some studies 
show no such relationship. In more recent studies Gilliland (20) using 
over 300 subjects with a total of about 5000 estimations ranging in 
length from 4 to 27 seconds found evidence favoring the Weber-Fechner 
law. But in another study by Gilliland and Humphreys (21) the per 
cent of errors was found to be significantly greater for the shorter than 
for the longer intervals. In this study the intervals ranged from 9 to 
180 seconds. It seems likely that the difference in results in these later 
studies was due to the wider range of time intervals in the later study. 
It seems probable that since time is not mediated by any one sense but 
through several different senses no single constant could be established 
in the Weber-Fechner formula. This may be especially true when the 
intervals vary greatly in length. 
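
The Weber-Fechner question can be made concrete with a small sketch. Under Weber's law the error of estimate grows in proportion to the interval, so the per cent of error stays roughly constant; a result such as that of Gilliland and Humphreys (21), where the per cent of error was greater for the shorter intervals, departs from that pattern. The function names, the tolerance, and the sample figures below are hypothetical illustrations and are not data from any study cited here.

```python
def percent_error(actual_seconds, estimated_seconds):
    """Absolute error of one estimate, as a per cent of the true interval."""
    return abs(estimated_seconds - actual_seconds) / actual_seconds * 100.0


def weber_fraction_is_constant(pairs, tolerance=5.0):
    """Return True if the per cent error is about the same at every interval.

    pairs: list of (actual_seconds, estimated_seconds) tuples.
    tolerance: largest allowed spread in percentage points (an assumption).
    """
    errors = [percent_error(actual, estimate) for actual, estimate in pairs]
    return max(errors) - min(errors) <= tolerance


# Hypothetical estimates whose error is proportional to the interval.
proportional = [(4, 5), (12, 15), (27, 33.75)]
# Hypothetical estimates in which short intervals are judged relatively worse.
short_biased = [(9, 13.5), (60, 69), (180, 189)]

print(weber_fraction_is_constant(proportional))
print(weber_fraction_is_constant(short_biased))
```

A constant per cent of error across intervals is only the simplest reading of Weber's law; the sketch does not address which sense mediates the estimate, which the text suggests may be why no single constant can be established.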

In the earlier studies of sex differences women were generally found 
to be poorer than men in time estimation. But in later studies such as 
those by Harton (26, 27, 28) and Gilliland and Humphreys (21) no 
such differences were found. It seems likely that the modern woman is 
called upon to make time estimates as often as men. Since time percep- 
tions are primarily dependent upon learning the sexes probably do not 
differ in ability to estimate time. 

Time Perception in Lower Animals 

Some knowledge of how time is perceived may be gained from a 
study of time perception in lower animals. Because we do not believe 
that the lower animals, at least, use mental imagery to anything like 
the extent that humans do, any theory that is based upon mental con- 
tent alone tends to be discredited. 

The principal experimental studies of time perception in lower ani- 
mals may be classified under the general headings of (1) delayed choice 
response, (2) delayed conditioned responses, and (3) time perception. 

There is considerable doubt whether delayed choice responses should 
be considered under the heading of time perception. The delay of the 
choice response may not be due to an appreciation of time interval by 
the animal making the delayed response. Nevertheless, a brief account 
of such experiments will be included here since the length of the delay 
is one of the important factors in such reactions. 

Hunter (33) using a multiple choice apparatus, measured the time 
that could elapse between the presentation of a light before the correct 
choice box and the release of the animal which would result in the 
animal going to the correct box. In a three choice box, Hunter found 
that for different rats the time could vary from one to ten seconds but 
the average time was not much more than one second. For dogs the 
time averaged about ten seconds and for raccoons it averaged about 20 
seconds. 

Walton (54) using a different kind of choice box obtained delays up 
to one minute in dogs. Warden and Warner (55) made tests on the noted 
dog Fellow and obtained delays of about 20 seconds between spoken 
commands and their execution. Yarbrough (59) obtained from two to 
four seconds delay in cats on a three compartment and 16 to 18 seconds 
on a two compartment maze. But Corvan (9) obtained delays of 
30 seconds with a Persian cat. In contrast with the previous studies 
Honzik (31) secured delays up to 45 seconds with rats. 
The great variance of these results no doubt partly illustrates individual 
differences in animals but it also illustrates differences in methods 
used and apparatus. Because of these differences in results no very defi- 
nite generalizations can be made other than to point out that the higher 
the animal in the scale, the more likely that the delay can be longer. It 
is also true that in the lower species the delay is maintained in terms of 
bodily posture whereas higher animals can change posture and still 
react correctly after a delay. 

Several authors have studied time responses in invertebrates. Ver- 
laine (53) taught wasps to discriminate between alleys of different 
lengths. He thought this showed a “concept of time.” Braunschmid 
(7) made studies on time responses of fishes while Grabensberger (23) 
worked with ants and Sterzinger (49, 50) with ants, bees, and wasps. 
Richter (44) and Johnson (36) have studied the diurnal rhythms of 
rodents. Johnson found that this rhythm developed in wild mice born 
and reared in utter darkness. Beling (4) working with bees found that 
he could condition them to return for feeding any time of day or night. 

Most of this work shows that some kind of relationship exists be- 
tween diurnal rhythm, metabolic rate, and some sort of a time sense. 
But this sense of time is no doubt very different from time perceptions 
in humans. 

Kuroda (40) by means of delayed conditioning found that rats were 
quite accurate up to about 20 minutes and that cats could be given de- 
layed conditioning up to about 30 minutes. Anderson (3) experimented 
on time discrimination in the white rat. He used a multiple choice 
maze and confined the rat for different periods of time. Rats were able 
to discriminate between periods of 10 and 20 seconds. Kuo (39) constructed 
a maze with four pathways to the food box. Some were longer 
than others and in one the rat was confined for 20 seconds. The rats 
learned to go by the shortest route and avoided the one in which they 
were confined first of all. These results were further substantiated by 
an experiment by Sams and Tolman (47). Of course, it must be realized 
that other factors than time perception affect the order of elimination of 
pathways in such experiments. 

Woodrow (58) taught two monkeys to differentiate between tem- 
poral intervals of 1.5 and 4.5 seconds. They were taught to reach for 
food upon the raising of a screen after the longer but not after the shorter 
interval. The monkeys became 90 per cent correct for these time inter- 
vals and 75 per cent correct in distinguishing between 1.5 and 2.18 
seconds. Metfessel and Bobbitt (41) found that canaries could be taught 
to discriminate and produce rhythms according to a prearranged pattern. 
They acquired an accuracy with an average deviation of only .01 
seconds. 

Learning in Time Perception 

While some writers claim that the perception of time like other kinds 
of perception is largely dependent upon certain innate factors, probably 
no one would deny that learning plays a part in the process. The im- 
portant questions are how does learning take place and how much of a 
part does it play. As indicated in the early part of this paper there are 
probably several different problems relating to time estimation. If so, 
then there are several problems related to learning in time perception. 

The effect of different cultures on the development of time concepts 
is stressed in two papers. Kloos (38) states that the Persians stress the 
future while the Chinese stress duration, and the Indians stress the stag- 
nation of time. Rodulescu-Motru (43) believes that the cultural effect 
amounts to a process of evolution whereby time is evolved through 
periodic changes in nature, in the body, and in culture. Thus he claims 
that part of the process of time perception is genetically inherited. 

In the field of the experimental study of the time perception in 
children Elkine (14) reports that in children from ten to sixteen years 
of age there were large errors in time estimation. He found average 
errors of 82 per cent for short intervals up to 4.1 seconds and 57 per 
cent for longer intervals for the younger children. In sixteen year old 
children the error was 32 per cent for short intervals and 30 per cent for 
longer intervals. Gilliland and Humphreys (21) found average errors 
ranging from 32 to 47.5 per cent in fifth grade children as contrasted 
with errors of 15 to 31 per cent for adults. No doubt the greater accuracy 
in children in the latter study as contrasted with Elkine’s study is due 
to the fact that Elkine studied Russian children from the home of workers 
whereas Gilliland and Humphreys studied American children from 
better homes. Such children have more occasion to be influenced by 
time and thereby learn the meaning of time intervals at an earlier age 
than the Russian children. 

Several studies have investigated what sense departments are best 
suited for perceiving time. Goodfellow (22) found that, considering 
audition as 100 per cent, vision gave 65 per cent and the tactile sense 
34. Other sense departments were even lower than any of these. Gault 
and Goodfellow (19) made two studies; one concerned with discrimina- 
tion efficiency and the other with reproductive efficiency. In both the 
auditory sense was most efficient, vision second and touch was third. 
Forbenius (16) had subjects wake from sleep at previously arranged 
times. He reports that they became able to do so within an accuracy of 
about five minutes. Wirth (57) taught subjects to reproduce given time 
intervals. Dudycha and Dudycha (11) asked 185 subjects to estimate 
the amount of time it would take them to perform six different tasks. 

Gilliland (20) taught a group of 13 subjects to estimate time by 
counting. After each estimate they were told the length of the interval. 
In 25 to 50 practices these subjects were able to reduce their average 
error from 20 to 25 per cent to 5 to 10 per cent. A similar group with 
an equal amount of practice but without counting did not improve in 
their estimating. 


Theories of Time Perception 

Despite the large amount of disagreement in results in studies on 
time, many theories have been proposed to explain the perception of 
time. Most of the older theories postulated some kind of a time sense. 
The theory presented by Francois (17, 18) and Hoaglund (30) assuming 
the existence of some kind of an internal chemical time clock comes 
under this general heading. But later studies have not confirmed this 
point of view. 

Psychoanalysts generally attribute time perception to some phase 
of the self. Hugenholz (32) explains time as follows: “The covert human 
time-form can exist only when there is a consciousness of self.” Bonaparte 
(6) tells us that the child in whom the unconscious dominates has 
only a vague concept of time as in day dreaming and other forms of 
imagining. Dooley (10) explains time as an “Ego-defense properly 
placed in that class of defenses comprised of attachments to real objects 
in counterdistinction to narcissistic attachments.” Just what is meant 
by such explanations, and how convincing they are, is left to the reader. 

Oberndorf (42) explains time as “primarily dependent upon the reality 
with which recurrent physiological reflexes are registered. It becomes 
distorted in the absence of the sense of reality and in cases of profound 
interference with the perception of reality time sense may cease to exist. 
Another factor which is strongly influential upon time perception is the 
presence or absence of purpose, for without purpose the value of time 
greatly diminishes.” Sterzinger (49, 50), as has already been stated, 
believes time perceptions are based upon body metabolism: when metabolism 
is high, time passes quickly; when it is low, time drags. This 
bears a close relationship to the theory of Francois and Hoaglund and is 
essentially the theory proposed by du Nouy (12). Dunlap (13) stresses 
the relationship between time perception and bodily rhythm. 

A number of writers emphasize the Gestalt point of view in time perception. 
Weber (56) stresses the relationship between time and space. 
In a series of experiments by a number of Gestalt writers the close relationship 
between time and space is emphasized. This relationship is 
expressed by the formula “phenomenal velocity equals phenomenal 
space divided by phenomenal time or v = s/t.” This is further emphasized 
in the statement “these variations in the flow of phenomenal time 
are not occasional examples which may be explained as illusions, but 
that they are continuous and are conditioned by almost any change in 
the structure of the entire field of movement.” 

While the evidence is either too conflicting or inconclusive to prove 
or disprove any theory of time perception, an analysis of the data lends 
support to certain general principles about how time is perceived. May 
we here sketch briefly some of these. 

As has already been stated the evidence does not support a theory of 
any direct sensory basis for time perception. Unlike space perception 
and the estimation of distance, time is less directly dependent on sense 
experiences as such. In the case of lower animals a discrimination of 
longer or shorter intervals and even some vague notion of time are, no 
doubt, mediated by physiological processes and body rhythm. 

In children this vague awareness of time is further amplified by 
learning units of time such as a second or hour. This knowledge is 
gained either directly or indirectly from timepieces. The significance 
of these intervals is interpreted in terms of body tensions and rhythm. 
As the child gains further experience it learns more about evaluating 
how much activity can occur in any given time. The sensory cues can 
come from any sense modality and the resulting activity may be physio- 
logical or mental. 

In the case of the rat or dog in the delayed conditioned response it is 
certainly mostly physical. In the case of the reader in estimating how 
long it has taken him to read this paper his estimate is probably based 
upon his estimate or knowledge of how many pages he has read and at 
what rate he reads a page. Likewise if a person knows that he ordinarily 
walks about five miles per hour when he has walked what he thinks is 
about three miles he will estimate that he has been walking for 40 
minutes. If a subject is not paying attention to what is going on during 
the interval or if he has no good idea of the amount that has taken 
place or of units for measuring the interval, his estimates will be poor. 
Training as shown in learning the normal rate of one’s counting during 
an interval will soon make a person able to estimate time very accu- 
rately. Without some kind of definable content during the interval, 
estimation remains poor. 
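
The inference described here, in which elapsed time is judged from the amount of activity and a known rate, amounts to a simple division. A minimal sketch follows; the function name is hypothetical, and the counting figures in the second call are illustrative assumptions. For a walker who covers three miles at five miles per hour the division gives 36 minutes, or roughly the 40 minutes such a person might report.

```python
def estimated_minutes(amount_done, rate_per_hour):
    """Elapsed time in minutes inferred from work done and a known hourly rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return amount_done * 60.0 / rate_per_hour


# The walker: about three miles covered at about five miles per hour.
print(estimated_minutes(3, 5))

# A counter: 30 counts at a learned rate of one count per second
# (3600 per hour); both figures are illustrative assumptions.
print(estimated_minutes(30, 3600))
```

The estimate is only as good as the two quantities supplied to it, which matches the observation that estimates are poor when the subject has no clear idea of the amount that has taken place or of a unit for measuring it.
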
In retrospect estimation depends upon the memory of events occurring 
within any interval. If the interval is filled with many events it will 
seem long in recall. If it is uneventful it will be remembered as short. 
Similarly in immediate estimation the estimation will depend upon the 
amount that has occurred or seemed to have occurred during the inter- 
val. It is for this reason that when we are doing interesting things time 
is generally underestimated. In waiting or boredom time passes slowly. 
However, there is always a correction that the estimator may make. If 
a person knows that time is underestimated during pleasant happenings 
and overestimated in boredom he may correct or often overcorrect for 
these influences. It is, no doubt, in large measure because of these facts 
that the literature on filled and unfilled time is so hard to interpret. 

According to this theory time estimation is not nearly so directly 
given from sense data as in space perception and estimation. Time 
estimation partakes more of the nature of a judgment rather than a 
perceptual process. The fact that in space perception the cues remain 
for further examination whereas in time perception after the event it is 
always in retrospect, helps to make time estimation more of a process 
of judgment than of perception. 

What has thus far been presented as a brief theory of time perception 
applies particularly to short intervals. Possibly longer intervals are 
estimated in somewhat the same way but more often with greater ref- 
erence to external events such as the position of the sun and familiar 
events such as the amount of traffic on the street as in estimating the 
time of day. Of course, when a timepiece is available it will be used. 

In an earlier part of this paper six general fields of research in time 
perception were indicated. These are areas in which further research 
could add much to our knowledge about time. In addition to these gen- 
eral fields of research much insight might be gained about the nature 
of time from a further study of time estimation in animals. The evolu- 
tion and improvement of time perception both in children and adults 
would also, no doubt, throw further light on this subject. Such studies 
would have both practical value in showing how to train persons in 
time estimation and would extend our knowledge about the psycho- 
logical nature of time. 

Steiner, Lee R. Where do people take their troubles? Boston: Houghton 
Mifflin, 1945. Pp. xiii+265. 

What do we psychologists do when our personal problems become 
so severe we are ready to jump off a tall building or take other drastic 
action? More adequate adjustive behavior involves consulting with a 
respected colleague or a psychiatrist. Mrs. Steiner is concerned with 
what the great gullible public does with its problems. Her book indi- 
cates that a surprisingly large number of otherwise fairly intelligent 
people consult such purveyors of psychotherapy as newspaper columnists, 
radio “consultants,” marriage or “get acquainted” bureaus, 
religious faddists, spiritualists, palmists, and numerologists. These 
“therapists” compensate for their lack of formal training by assuming 
an air of great self-confidence and something approaching omniscience. 

The book begins with a useful discussion of “What is a psycholo- 
gist?” It is pointed out that there are recognized courses of training 
preparing people to be psychiatrists, psychiatric social workers, psycho- 
analysts, and clinical psychologists. Examination of telephone direc- 
tories and contacts with license bureaus and health departments in big 
cities showed an almost complete lack of regulation of people dispensing 
psychotherapy as long as they avoided hypnosis and the treatment of 
serious mental disorders. 

What sorts of therapy and therapist appear in this wide open field? 
It is easy to discount “graduates” from the “College of Divine Meta- 
physics.” This institution confers a “Doctor Degree in Psychology” on 
the basis of two courses, costing fifty dollars each. A surprising thing 
is the large number of similar institutions giving “degrees.” A common 
characteristic of most of the “therapists” discussed by Mrs. Steiner is 
their lack even of one of these dubious degrees — one simply adds to 
one’s name a set of letters like Ms.D. or Ps.D. After all, why bother to 
go to school when your practice is based on common sense, of which 
you clearly have more than your share? 

As a circulation-builder the newspaper columnist-consultant is 
probably useful to the publisher. However, the advice given may often 
be harmful and is rarely helpful. Mrs. Steiner sent a copy of the same 
letter to a number of leading columnists, briefly outlining a mother’s 
problem with a disobedient child. One advocated sending the boy to a 
special school, and several others proposed stern discipline of one sort 
or another. Similarly, advice based on inadequate evidence is freely 
given on a number of radio programs listened to by millions. Both 
types of “advisors” have in common a certain incompetence and an 
engrossing interest in the commercial aspects of their ventures. 

The alleged therapist who deals with an office clientele is most obviously 
the profiteer on others’ emotional problems. Mrs. Steiner devotes 
the greater part of her book to a carefully documented report of 
her experiences with these people and their “patients.” We can visit 
with her the “Human Audit Bureau” where we are given vocational 
guidance on the basis of our brain development as measured with special 
calipers. If we need to find companionship of a sort, we can always 
contact a marriage broker or some introduction service. After all, “Why 
let lonesomeness rob you of romance, happiness, and comfort?” What 
shall we do if we are really interested in religion but are not satisfied 
with the regular recognized churches because they don’t seem to solve 
our personal difficulties? This problem is “easy.” There are many 
local “churches” to take care of us — would we prefer spiritualism mixed 
with a little hypnosis, or stick with the “I Am” system? Perhaps we 
are prejudiced in favor of something like palmistry, graphology, astrol- 
ogy, numerology, or tarot; many practitioners of these occult sciences 
parade through the chapters self-confidently if somewhat denuded of 
respectability. 

Mrs. Steiner is not one of the smoothest of writers. This reader’s 
sensation was often one of bumping along from one section to another 
with little help in transition. The overwhelming mass of apparently 
scrupulously recorded and reported experiences making up the book’s 
documentation probably contributes to this feeling of disjointedness. 
However, the subject material of the book will be of great intrinsic inter- 
est to any professionally-minded psychologist, and a careful reading will 
repay anyone with even the glimmerings of a “social conscience.” 

A review would be incomplete without at least a mention of some of 
the book’s obvious possibilities for professional use. One could: 

1. Recommend it to his more intelligent students in elementary or clinical 
psychology courses as supplementary reading material. 

2. Pass it along as encouragement or reminder to committees in profes- 
sional organizations engaged in formulation and acceptance of standards to be 
met by practicing applied psychologists. 

3. Use it to strike lightly those psychologists who have obstructed the set- 
ting up of such professional standards. 

4. Bring it to the attention of legislators and point out similar conditions 
existing in their own constituencies, hoping that legislative support may thus 
be obtained for regulation of people unwarrantedly calling themselves psy- 
chologists. 

5. Write letters quoting its conclusions to newspapers and periodicals ac- 
cepting advertisements from individuals and organizations similar to those 
discussed. 

6. Pass along the information from it in any ethical, practical way to the 
public which has so often been victimized. 

7. Use an occasional rereading of sections of the book to reawaken one’s 
conscience and remind oneself that such abuses do exist, and that it is the 
professional psychologist’s duty to eliminate them. 

Robert Ammons. 

State University of Iowa. 

Kaplan, Oscar J. (Ed.), Mental disorders in later life. Stanford, Calif.: 
Stanford Univ. Press, 1945. Pp. vii+436. 

The disparity between the available research on the early segment 
of development (childhood), and the late (senescence), has long been 
obvious. The paucity of factual material available on mental decline 
has no doubt been disquieting to the teacher of developmental psychology 
who strove to present consistently the entire sequence. Particularly 
is this void conspicuous when aging is viewed as the ultimate conse- 
quence of maturation and integration. Moreover, this period of the in- 
dividual’s growth assumes increased importance since more persons now 
live to be old, thus steadily increasing the ratio of old people to 
those in other age brackets. 

The editor modestly presents the seventeen chapters as a collection 
of essays and not “a full summary of the vast literature already available 
in this field.” Although the title and general orientation of the 
volume emphasize mental disorders, several of the chapters are good, 
up-to-date reviews of typical changes with aging. 

In these days of specialism it is heartening to find men of diverse 
scientific backgrounds collaborating on the treatment of phenomena of 
a limited field, even though this cooperative effort is additive rather than 
integrative. A brief, general introduction of the volume by a psychia- 
trist (Karl M. Bowman) is followed by a review of later life disorders 
by a statistician (Horatio M. Pollock). Well documented systematic 
articles on the physiological (Nathan W. Shock) and psychological 
(Harold E. Jones and Oscar J. Kaplan) aspects of normal and abnormal 
age changes follow. To complete the survey of older age processes the 
newer and highly significant sociological (H. Warren Dunham) and 
nutritional (L. Erwin Wexberg) approaches are added. 

Chapters on the Neuroses (Norman Cameron), the Involutional 
Psychoses (Eugene Davidoff), the Presenile Dementias (George A. Jer- 
vis), the Senile Psychoses and Psychoses with Cerebral Arteriosclerosis 
(David Rothschild), the Aged Subnormal (Oscar J. Kaplan), and the 
Toxic Delirious Reactions (G. Wilse Robinson, Jr.), outline the clinical 
disorders of this age period with an emphasis on the recent literature. 
Psychotherapy (Fred V. Rockwell) and Mental Hygiene (Nolan D. C. 
Lewis) each merit a chapter in addition to sections on treatment in the 
clinical chapters. A systematic chapter ends the collection. It is written 
by a gerontologist (Edward J. Stieglitz) who has a broad perspective 
on the phenomena of aging and its unsolved problems. The chapter 
does not, however, specifically articulate with those preceding it. Sand- 
wiched between the clinical types is a novel chapter describing a study 
in one institution of Older mental patients after long hospitalization 
(Eugenia Hanfmann). 

Like most volumes of this kind, the style, general outlines, extent 
of organization of facts and completeness of coverage of the materials 
vary greatly. The editor does not state the degree to which each writer 
was cognizant of the other contributors’ outlines or manuscripts. However, 
the overlapping and repetitions that occur do not detract from 
the book. 

The contributors differ in the extent to which they deal in mere description 
and in hypothetical explanation. Some approach their phenomena 
from an atomic and some from a multiphasic standpoint. Wexberg, 
in treating nutrition, finds it necessary to discuss employability 
of the aged. In a discussion of neuroses, Cameron generalizes more as 
to the basic nature of psychological changes than do the authors of the 
chapter devoted to that subject. Shock integrates his detailed, schol- 
arly chapter, based upon 281 titles, and attempts to explain impair- 
ment of mental functions by the hypothesis that aging is attributed to 
progressive loss in the homeostatic capacities of the organism. Dunham, 
with less data, presents a framework to interpret the interplay of cul- 
ture upon the biological factors. 

This factual volume is a distinct contribution to abnormal and de- 
velopmental psychology. Dr. Kaplan is to be commended for his broad 
perspective of the problem, his appeal to specialists for data, and his 
excellent choice of contributors. However, with full cognizance of the 
enormity of the task and the potential vulnerability of the author, one 
does regret that a chapter has not been added to attempt to integrate 
all of the contributions. 

Fred McKinney. 

University of Missouri. 

Sherman, M. Intelligence and its deviations. New York: Ronald Press, 
1945. Pp. x+286. 

The first part of this brief text describes the nature of intelligence, 
mental growth, and the relation of mental growth to physical and en- 
vironmental factors. There follow brief discussions of intelligence in 
delinquency and psychosis, but most of the discussion of deviations con- 
cerns mental deficiency in its association with neurological disorders. 
There are brief sketches of intelligence tests, of genetic problems in 
intelligence, and of intellectual superiority. 

The treatment of the topics is very uneven. The author seems at 
home in the medical discussion of such conditions as cretinism, microcephaly, 
birth injury, epilepsy and hydrocephalus, and the information, 
presented rather technically, will be of interest to the psychologist. 
There is a chapter on the adjustment of the mental defective which, 
while short, surveys a number of studies. As a general text, however, 
this book is not satisfactory. 

In a general discussion of intelligence and its deviations, which it is 
the author's stated intention to provide, one would expect a well- 
rounded development of the many facets that the subject possesses. 
Yet, in this very brief book, there is a very meagre treatment of most 
of the topics, and some important subjects are not mentioned at all. 
One feels that Professor Sherman wished to present the medical and 
neurological aspects of certain conditions which occur in some defec- 
tives, and then added material deemed necessary to prevent the book's 
having a rather limited and specialized appeal. 

There is a very brief and inadequate discussion of the rationale of 
intelligence testing, a knowledge of which is indispensable to the sys- 
tematic understanding of the field. There is very little mention of 
qualitative features of test performance, test patterns, problems of 
rapport, differences between diagnostic groups (some of which are never 
mentioned), or of many other significant and important topics. From the 
book, one would conclude that the only really significant deviation from 
normal intelligence is in the direction of deficiency. Seven pages are 
devoted to the intellectually gifted! In the chapter on intelligence and 
delinquency, the major problem raised is whether the average intelli- 
gence of delinquent groups is below that of other groups. Only eight 
pages are devoted to intellectual changes in psychoses other than the 
psychoses of mental defectives, and little of the extensive literature in 
this area is cited. Few of the basic problems studied are made known. 
It is a sterile and rather futile picture of the study of intellectual devia- 
tion that is revealed in these pages. None of the studies of intellectual 
function of such investigators as Goldstein, Hanfmann and Kasanin, 
N. Cameron, J. Hunt (to mention but a few) is mentioned, and the 
whole problem of test performance in relation to lobectomy is ignored. 

The emphasis in presentation is well illustrated in the chapter on 
epilepsy, to which 19 pages are devoted. Three pages are given to 
intellectual changes (deterioration). The nature of epilepsy, its age of 
onset, personality changes, and its pathology are described, and one 
derives a reasonably adequate picture of the disease. But this is a book 
about intelligence and its deviations. 

The discussions of general topics, such as theories, growth, and the 
like, are poorly organized and will hardly leave the student with a clear 
idea as to what the problems are or what has been done in their solu- 
tion. Professor Sherman feels, for example, that IQ differences found 
in identical twins reared separately are an indication of “achievement 
levels” dependent upon different environments. This, of course, is the 
crux of the matter. But he then argues that such results cannot evidence 
the influence of environment on intelligence, because tests are 
not culture free (253). Is the concept of intelligence itself culture 
free? 

There are a glossary of terms, mostly medical, and a bibliography 
of over 300 titles. It is perhaps significant that but seven titles bear 
publication dates later than 1939. 

Charles N. Cofer. 

George Washington University. 

Brennan, Robert E. History of psychology, from the standpoint of a 
Thomist. New York: Macmillan, 1945. Pp. xvi+277. 

The author presents and evaluates the history of psychology from 
the point of view of Thomas Aquinas, whose views in turn were based 
upon the writings of Aristotle. Nearly half of this condensed history is 
devoted to the antecedents, philosophic and otherwise, of psychology 
as a separate field. The remainder of the book is devoted to an accurate 
but very brief presentation of the important events, research, and per- 
sonalities in the development of modern psychology. The following is 
evidence of the brevity of the presentation: Wundt, and his achieve- 
ments, are presented in 4 pages, although there are references to his 
name or work on 18 other pages; William James is given 3 pages and 4 
other references; while Freud is given 13 pages and 9 other references. 

Historical sequence is carefully followed within the development of a 
topic, but the omission of dates limits the possibility of temporal cross 
reference from one topic to another. This is compensated for in part 
by the inclusion of a bibliographical index which gives the years of 
birth and death as well as listing the chief publication of the individual 
by title. 

The author's viewpoint is epitomized in the following paragraph. 

What we need today, as Aquinas would contend, is really less of psychology 
and more of anthropology, using the term “anthropology” in its traditional 
meaning to signify the study of man, as man, not as a concatenation of reflexes, 
or a sum of perceptual configurations, or a series of imaginal processes or a 
complexus of instinctive responses. Such things are simply isolated events in 
the history of human nature, and they have no meaning except in relation to 
the whole nature. Further, the study of man, as man, means the study of man 
as a besouled organism, or as a creature composed of matter and spirit, whose 
operations fall within the dimensions of scientific analysis, but whose funda- 
mental nature is the proper study of philosophy. (257) 

The research man will probably find the material of this book too 
condensed and the references too limited to serve his interests. Many 
an academic man will object to the philosophical and religious emphases 
as being too extensive for his effective use in the classroom. However, 
considering the religious and philosophic background of the author, 
most psychological readers will be gratified to observe the emphasis 
placed on and the respect shown toward the established scientific observations 
in all fields of psychological investigation. 

Wilson McTeer. 

Wayne University. 

Traxler, A. E. Techniques of guidance: tests, records, and counseling in 
a guidance program. New York: Harper, 1945. Pp. xiv+394. 

This comprehensive survey of methods and procedures for use in 
guiding and counseling high school students is written for the benefit 
of high school teachers and administrators. The book includes (xiii) 
“much detailed explanation concerning tests and other instruments of 
evaluation and a large number of illustrative record forms,” written 
to advise school personnel who may have had little psychological train- 
ing. Detailed and pertinent suggestions on a variety of associated ad- 
ministrative problems are also provided. Such a book is to be evaluated 
not as a scientific treatise but as a guide to the proper application of 
psychological techniques and procedures. 

Apart from a brief discussion of occupational opportunities early in 
the book and an equally brief treatment of “adjustment” problems at 
the end, the primary concern is with problems of distribution or place- 
ment of students within the educational system. The book is organized 
around the individual cumulative record, which is to include information 
obtained periodically for each student in the following ten areas: home 
background, health, out-of-school experiences, plans for the future, 
school history and record of class work, mental ability or academic apti- 
tude, special aptitudes, achievement and growth in different fields of 
study, educational and vocational interests, and personality. Informa- 
tion in the first four areas can be obtained from each individual by 
interview or questionnaire, and school records accumulate more or less 
automatically. 

Information in the last five areas involves the use of tests and ob- 
servational procedures, and it is here that a detailed explanation of 
methods of obtaining and applying such information is needed. A minimal 
explanation of the applicability of a measure in a guidance situation 
should include a statement of the validity of the measure in predicting 
scholastic, vocational, or other performance, so presented that 
a statistically unsophisticated person could grasp the actuarial nature 
of the problem. This we are nowhere given. 

The chapter on the appraisal of aptitudes describes or mentions a 
number of standard tests of intelligence or general academic aptitude 
suitable for use with high school students. A carefully prepared anno- 
tated list of thirteen such tests, which is presented at the end of the 
chapter, gives for each test the author(s), publisher, cost per copy, 
mention of the component parts, and reliability coefficients. For only 
one of the thirteen tests is validity in relation to scholastic performance 
mentioned at all. Since tests of this type do have substantial validity 
for many purposes, and considerably less validity for many others, it 
would seem that a more intensive consideration and explanation of this 
subject would lead to a more discriminating use of such tests. 

Under “special aptitudes,” similar information is given for tests of 
art, music, science, mechanical, clerical, mathematics, and foreign language 
aptitudes. Counselors are warned that the label of a specific 
guage aptitudes. Counselors are warned that the label of a specific 
aptitude test does not always indicate accurately what the test actually 
measures, but methods of choosing the most effective of several similarly 
named tests are not provided. 

A chapter on the evaluation of achievement presents an exhaustive 
annotated list of achievement tests in all scholastic areas. 

In presenting a catalogue of personality tests, questionnaires and 
interest inventories, the author is carefully critical of some of the procedures 
reviewed; others are simply mentioned or described briefly. 
(107) “Although there have been many studies of personality tests, the 
validity of the various tests in this field has not been established.” 
Thus the precautionary statement earlier (100), “Before introducing a 
new test for measuring personality, a school will do well to try it out 
experimentally with a few pupils to see whether or not it contributes to 
the description of the personality of the individual pupil.” 

Several rating scales are described with a caution about “halo effect.” 
Much more attention is given to the “anecdotal record,” a systematic 
collection of anecdotes “concerning some aspect of pupil behavior which 
seems significant to the observer.” No attempt is made at a statistical 
evaluation of this anecdotal procedure, either as to inter-observer re- 
liability or as to the prognostic value of various types of behavior 
samples. 

In a separate chapter, a large number of uses to which objective 
test results may be put are described. A discussion and proposed solu- 
tion of the validity situation is given as follows (199): 

Research is the only means by which the prognostic nature of a test can be 
adequately investigated, but research techniques are often slow and expensive 
and not enough time has elapsed since the publication of the majority of the 
good tests for research to discover their relationships to fundamental long-time 
objectives. . . . Personnel workers in schools, however, cannot wait until re- 
search with tests has supplied answers to many of their questions, for they must 
meet today the problems of guiding their present students. Nor is it necessary 
for them to wait, for they can read a great deal of meaning into the scores on 
the basis of their own experience. 

The reviewer believes that if some of the effort being spent on reading 
meaning into test results could be channeled into research on their 
relationships to objectives, the number of areas in which we read meaning 
from rather than into scores could be increased. 

Merrill Roff. 

Indiana University. 




Yale, J. R. Frontier thinking in guidance. Chicago: Science Research 
Associates, 1945. Pp. 160. 

This book is an anthology of 24 articles in the field of guidance. All 
but one of the papers have appeared elsewhere and are reproduced in 
this volume because, in the opinion of the editor, they “are selections 
written fairly recently which the well-informed person should know or 
know about” and are “recommended reading of definite interest to the 
guidance person.” Each of the 24 contributors (13 of whom are psy- 
chologists) is a recognized authority and has made significant studies in 
this field. 

There are six parts to the book: Part one, Education prepares to 
increase guidance, contains a discussion of needed reforms in the secondary 
school system (Reeves), and presents proposals of the National 
Resources Planning Board (Blaisdell) for increased federal support and 
of the New York Regents (Wilson) for increased state support of secondary 
education; Part two, Guidance comes of age, contains articles on the 
genesis of modern guidance (Paterson), the growth of occupational 
testing (Borow), the relation between guidance and instruction (Brewer), 
evaluating the effectiveness of counseling (Wrenn and Darley), the importance 
of guidance in the elementary school (Addy) and includes a 
summary of the statement prepared by the ACE Committee on Problems 
and Plans in Education on student personnel services; Part three, 
School guidance programs in operation, discusses plans developed in 
Glencoe, Illinois (Kawin), Philadelphia (Boyer) and San Francisco 
(Schmaelzle); Part four, Undertaking readjustment of the veteran, presents 
material pertinent to the counseling of veterans in the post war period 
(Williamson), accrediting veterans returning to school (Jacobson) and 
the veterans’ programs of two states, Connecticut and Michigan (Gray, 
Fern, Horn); Part five, Tools for the guidance worker, includes articles 
on non-directive counseling as an effective technique (Rogers), a survey 
of occupational filing systems (Yale), occupational analysis activities 
in the War Manpower Commission (Shartle and Dvorak), a critical 
analysis of textbooks surveying the field of guidance and student personnel 
(Fredenburgh) and a minimum library for counselors (Smith); 
and Part six, The counselor obtains additional training, contains sections 
devoted to laboratory training for counselors (Lloyd-Jones), selecting 
and training teachers in Fargo, N. D., for individual guidance (Froehlich) 
and short term training programs for counselors (Jaqua). 

Obviously with such a miscellaneous collection of topics it is difficult 
to arrive at an adequate overall evaluation. The articles were prepared 
quite independently of any thought of inclusion in the volume under 
review, so duplicate each other in minor ways and do not of course con- 
stitute a systematic treatment. Certain of the articles are critical in 
content whereas others are merely descriptive. They are interesting but 
undoubtedly many similar volumes of completely different articles could 
have been prepared, so one may ask, why this particular collection? 
There is no adequate discussion of this point, nor is there any introduction 
or summary that brings together the points which the editor may 
have thought significant. It would seem that if the separate articles are 
worth presentation in a single volume (and the reviewer is not doubting 
this) they should also be worth an adequate summarization in some 
chapter designed to integrate the ideas involved. If the title were less 
pretentious and something like “A few readings in the field of guid- 
ance” the contents would be much more accurately described. This 
is not to deny the editor’s contention, however, that some of the articles 
included do represent significant and critical thinking in the sense which 
the title implies. 

Leonard W. Ferguson. 

Field Management Division 
Metropolitan Life Insurance Company 
New York 10, N. Y. 

Dunsmoor, Clarence C. (Chairman) for New York State Counselors 
Association. Practical Handbook for counselors. Chicago: Science 
Research Associates, 1945. Pp. 160. 

Although this handbook was prepared to assist high school coun- 
selors, it will be of interest and value to high school teachers and coun- 
selors, college and university counselors, teachers college faculty who 
teach courses in guidance or school administration, and to school ad- 
ministrators or boards who are responsible for authorizing guidance 
programs and selecting counselors. 

An outstanding merit of the handbook is the comprehensiveness of 
the guidance program which it outlines. It outlines a program extend- 
ing from the elementary school through follow-up studies of post-school 
adjustments and including educational and vocational guidance, psy- 
chological testing, personal adjustment counseling, study methods, and 
orientation to the school. Its major deficiency is to be found in the area 
of actual guidance methods. That is, the handbook presents an excel- 
lent outline of the administrative procedures, tools, and techniques of 
collecting data to be used in guidance but does not go very far in telling 
the counselor how to use the data in guiding the individual. For ex- 
ample, several psychological tests are listed but very little comment is 
made on how to interpret or use the test results; several study skills 
are listed but no suggestions are given as to methods by which they can 
be developed; and the techniques and content for a case study are out- 
lined but no suggestions made as to how to use or interpret the data 
collected. It appears that the authors assumed that the counselor who 
uses the handbook has been trained in guidance methods and psychology 
but can profit by a comprehensive review of the scope of work that 
he should include in his guidance program. In spite of this deficiency, 
the handbook deserves a distinctive position in the literature regarding 
guidance programs in the elementary and secondary schools. If a school 
system organized a guidance program of the scope outlined in this hand- 
book and secured adequately trained personnel to do the guidance work, 
it would have an excellent program. However, one administrative ar- 
rangement the handbook omits is for the counselor to have access to 
psychological or psychiatric assistance in the diagnosis and treatment 
of behavior problems, nor does it recommend that the counselor be 
trained in psychological therapy, abnormal psychology, or mental 
hygiene. 

The handbook contains eighteen chapters, which are very brief, 
averaging seven pages per chapter. The chapters cover such topics as: 
the role of the counselor, the objectives, scope, and content of the guidance 
program, technique of the counseling interview, the maintenance 
of cumulative records, tests of abilities, achievement, interests, and 
personality, content and techniques of the case study, group activities 
as a part of the guidance program, orientation and adjustment to the 
school, course selection and program planning, vocational guidance, 
placement program, preparation for college, financing a college educa- 
tion, follow-up studies of post-school adjustments, the counselor’s per- 
sonal qualifications, the counselor’s relationships with other persons and 
agencies, the counselor’s tools, and the counselor’s professional develop- 
ment. 

Chapter 17, The counselor's professional tools, contains an excellent 
bibliography of books, journals, and other publications dealing with 
youth problems, guidance techniques, specific phases of guidance, and 
occupational and educational information. At the end of the handbook 
is a bibliography of additional books on various phases of guidance and 
a list of agencies which establish the qualifying requirements for various 
occupations and professions. 

Wilbur S. Gregory. 

University of Nebraska. 

Triggs, Frances O. Personnel work in schools of nursing. Philadelphia: 
W. B. Saunders Co., 1945. Pp. xv+237. 

The “personnel point of view” is doubtless as applicable and as efficacious 
in the field of nursing training as in any other area of education. 
In fact, a very considerable amount of work has been done in this area, 
as is evidenced by the bibliography of 354 titles included in the present 
book. The book itself is in the nature of a general survey of the field 
of student personnel work with particular reference to the problems of 
student nurses. 

The general treatment of the topic is rather eclectic with a generous 
infusion of the “newer approach to psychotherapy” sponsored by Rogers 
and by the Elliotts. The consequences of this eclecticism are evident in 
a conflict between statistical objectivity and clinical case-work. 

The book starts with a sketchy exposition of Human Behavior and 
Adjustment which runs the gamut of physiology (the autonomic nervous 
system and the endocrines), emotion and motivation, learning and conditioning, 
and the psychology of adjustment, all in 23 pages. This is 
followed by five chapters on The Counselling Program (educational, per- 
sonal and vocational), and three chapters on testing and personnel 
records. The final chapter, entitled The Story of a Student, sets forth a 
complete case history in some detail. 

The treatment is rather uneven. The sections on counselling prob- 
lems are written in a style that is clear and straightforward, if somewhat 
too cursory. In handling the more technical topics of testing and statistical 
analysis, however, the author often descends to a “popular” level 
which is confusing at best. 

Altogether, the book presents a sympathetic and (for the most part) 
readable account of the work of the counsellor in a school of nursing 
which should serve as a stimulus to “start the prospective professional 
counsellor ... on the road to her technical training.” Its major weak- 
ness lies in its failure to provide a rational basis for the predictive process 
and its reliance on an uncritical amalgam of any and all information 
arrived at through the somewhat mystical medium of the counselling 
interview. 

E. Donald Sisson. 

Personnel Research Section, AGO 

New York, N. Y. 

THE “LAWS” OF RELATIVE VARIABILITY OF MENTAL TRAITS 

ROBERT S. ELLIS 
Pomona College 

The purpose of this paper is to survey the present status of our 
knowledge of the relative variability of mental traits and especially to 
examine critically the various generalizations and “laws” that have 
been proposed by writers on this subject. 

The usually accepted method of computing relative variability is 
to use the formula devised by Karl Pearson for the Coefficient of Variation 
(V), according to which V = 100 S.D./M. We divide the standard 
deviation (S.D.) of the distribution by the mean (M) of the distribu- 
tion to two decimal places and by multiplying by 100 we eliminate the 
decimal. This makes it possible to compare the variabilities of charac- 
teristics which have been measured in different units such as pounds, 
inches, seconds and the unit scores of some psychological tests. It also 
makes an allowance for differences in averages when measurements are 
in the same units. 
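The computation described above can be illustrated with a minimal sketch. The heights are hypothetical figures chosen for the illustration, and the population form of the standard deviation is an assumption, since the text does not say whether n or n - 1 is intended. The function name is likewise only illustrative.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Pearson's V = 100 * S.D. / M (population S.D. assumed)."""
    m = mean(scores)
    if m == 0:
        raise ValueError("V is undefined when the mean is zero")
    return 100 * pstdev(scores) / m


# Hypothetical heights of five persons, first in inches, then in centimeters.
heights_in = [60.0, 64.0, 66.0, 68.0, 72.0]
heights_cm = [h * 2.54 for h in heights_in]

v_in = coefficient_of_variation(heights_in)
v_cm = coefficient_of_variation(heights_cm)

# The unit of measurement cancels out of the ratio, so both V's agree.
print(round(v_in, 2), round(v_cm, 2))
```

Because the change of unit multiplies both S.D. and M by the same factor, the two values of V coincide, which is the property that permits comparison across pounds, inches, seconds and test scores.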

If we wish to apply the above formula to psychological test results 
we encounter in many cases a serious difficulty which is overlooked in 
some discussions of the subject. In order to use the V formula for most 
comparisons the measurements used must be arrived at by using scales 
with true zeroes and the scales must be in approximately equal units. 
It is especially important that V scores not be calculated for error scores 
or time scores which decrease as performance improves. In these 
cases zero scores would mean perfect performance rather than the 
absence of any ability. 

Unfortunately the zeroes of psychological tests are commonly not 
true zeroes of ability. As a result the mean is usually too small, so that 
when we divide S.D. by M, the resulting V is too large. If then we com- 
pare V’s on two tests which have zeroes at relatively different distances 
from their true zeroes the results cannot be interpreted accurately. 
This error is most serious when dealing with higher and more complex 
functions such as memory and reasoning. 
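The effect of a misplaced zero can be seen in a small sketch. The scores and the 30-unit offset are hypothetical, chosen only to show the direction of the error; nothing here is taken from an actual test.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Pearson's V = 100 * S.D. / M (population S.D. assumed)."""
    return 100 * pstdev(scores) / mean(scores)


# Hypothetical abilities expressed on a scale with a true zero.
true_scores = [40, 45, 50, 55, 60]

# Assume the test begins to register ability only 30 units above the
# true zero, so that every recorded score is 30 units too low.
test_zero_offset = 30
test_scores = [s - test_zero_offset for s in true_scores]

v_true = coefficient_of_variation(true_scores)
v_test = coefficient_of_variation(test_scores)

# S.D. is unchanged by the shift, but M falls from 50 to 20,
# so V is inflated by the factor 50/20.
print(round(v_true, 1), round(v_test, 1))
```

Under these assumptions the recorded scores yield a V two and a half times the true one, although the individuals differ from one another exactly as before.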
While the Pearson method of computing relative variability has been 
generally used, other methods have had supporters. Thorndike (86, 
p. 9) suggested that the square root of M be used instead of M in the 
Pearson formula. This change increases relative variability in some 
cases where the use of M gives decreases. Yule (109, p. 49) suggests 
that deviations be compared by determining their ratios to the geo- 
metric mean. Wechsler (98) after eliminating pathological extremes 
determines the ratios of the highest to the lowest measurements. This 
method seems to have been used because of Wechsler’s hypothesis that 
the ratio of the highest to the lowest measures was limited by the mathe- 
matical constant e (2.718..). This, however, does not seem to hold for 
some simple sensory functions. Woodrow (105) changes raw scores to 
scaled scores and compares S.D.’s to determine the effect of practice. 
Peters and Van Voorhis (67, pp. 78 ff.) throw the baby out with the 
bath water: they suggest that the zeroes of all distributions be placed 
about 3 sigmas below the mean where the scores begin to diverge. In 
normal distributions this would make all V's equal to about 33 and 
hence useless for comparison. 
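The objection to the Peters and Van Voorhis proposal follows from simple algebra: if the zero is placed 3 S.D. below the mean, the new mean is 3 S.D., and V = 100 S.D./(3 S.D.), or about 33, whatever the data. A short sketch with two hypothetical sets of scores of very different spread illustrates the point; the function name and the data are assumptions made for the illustration.

```python
from statistics import mean, pstdev


def v_with_zero_three_sigma_below_mean(scores):
    """Re-score from a zero set 3 S.D. below the mean, then compute V."""
    m = mean(scores)
    sd = pstdev(scores)
    zero = m - 3 * sd
    rescored = [s - zero for s in scores]
    return 100 * pstdev(rescored) / mean(rescored)


# Two hypothetical distributions with the same mean but unequal spread.
narrow = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

v_narrow = v_with_zero_three_sigma_below_mean(narrow)
v_wide = v_with_zero_three_sigma_below_mean(wide)

# Both come out at about 33.3, so the comparison tells us nothing.
print(round(v_narrow, 1), round(v_wide, 1))
```

The result does not depend on the shape of the distribution once the zero is fixed in this way, so the two sets, though plainly unequal in spread, cannot be distinguished by V.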

These differences in method are important because the conclusions 
reached may depend on the method used. However, since most students 
accept the Pearson formula we shall use it as a basis for further dis- 
cussion. 

Historical Background 

The earliest suggestions of laws of variation were, quite naturally, 
made by biologists, but some of these have psychological implications 
and should be mentioned. 

In 1809 Lamarck (48) opened the question as to the natural causes 
of variations by advancing his well-known idea that organisms vary in 
response to environmental stimulation and in a way to meet environmental 
requirements. Characteristics become stronger through use or 
weaker through disuse, and these changes are transmitted to offspring. 

Fifty years later Darwin (12) included a chapter on “Laws of 
Variation” in The Origin of Species. He accepts the Lamarckian idea 
about use and disuse but places less emphasis on it and emphasizes 
variation from other “ . . . causes of which we are quite ignorant.” 
Among his conclusions are these: rudimentary organs are more variable 
than ordinary organs; highly developed organs are unusually variable; 
specific characters are more variable than generic characters; and lower 
organisms are more variable than higher organisms. 

Havelock Ellis (22), in 1894, developed and emphasized the idea
that men are more variable than women. The evidence for this claim
rested largely on physical characteristics. Three years later, Pearson
(66) examined at some length the problem of sex differences in varia- 
bility in physical traits. He concluded that evidence was lacking for 
greater male variability. Pearson also concluded from the study of 
cephalic indices that civilized races are relatively more variable than 
primitive races. 

In 1902 Vernon (95) published a very important biological study of 
variation with incidental psychological references. This was based on a 
careful statistical treatment of measurements. After studying varia- 
tions in physical growth he formulated a tentative law as follows: 
“The variability of a developing organism diminishes regularly with its 
growth” (p. 206). Minot and Pearson, working independently, apparently
did the most important work which furnished the basis of this
law. Vernon agrees with Minot that this reduction in relative variabil- 
ity during growth probably occurs in all mammals. Psychologists 
naturally will ask: Is this true for mental traits? 

Vernon also discussed the possible influences of genetic selection 
and of natural selection on variability. In both cases, relative variabil- 
ity would be decreased. 

In 1895 Binet and Henri (3, p. 417) after reviewing studies made up 
to that time in the psychological field conclude: 

Among the results which emerge from all these studies we shall cite a few: 
The first, the most important of all, we believe, is that the higher and more 
complex a process is the more it varies from individual to individual: sensations 
vary from one person to another, but less than memory, memory of sensation 
varies less than memory of ideas, etc. It follows from this then that if one wishes 
to study the differences between two individuals it is necessary to begin with 
the highest and most complex processes and it is only secondarily that we need 
to consider the simple and elementary processes: this is, however, the opposite of 
what has been done by the great majority of authors who have treated this 
question. (Translated by the writer.) 

This statement was made of course before the statistical treatment 
of mental measurements had become common. It is probably based 
primarily on observations of qualitative differences. 

I have not seen the first edition of Stern’s Differentielle Psychologie
which appeared in 1900 but in the third edition (83, pp. 257 ff.) he 
accepts the conclusion of Binet and Henri that complex traits are more 
variable than simple ones. To this he adds the idea that there is a 
positive correlation between the time of appearance (phylogenetically 
and ontogenetically) of a trait and its variability. Hence traits appear- 
ing in adolescence would be more complex and more variable than 
those appearing in infancy. He expresses the idea that relative variability
should be determined by finding the percentage which a deviation
is of the mean.

A study made by F. L. Wells seems to have suggested to Stern the 
idea of comparing the variability of traits within an individual with the 
variability of a single trait in a group of individuals. He concludes 
that higher processes will vary more than lower processes in a single 
individual. Presumably this would mean, for example, that mathe- 
matical ability and linguistic ability would show more independent 
variability in the same person than would auditory and visual acuity. 

Thorndike (87, p. 317) formulates two tentative conclusions about 
relative variability as follows: 

1. The variations are, in general, greater in acquired than in original traits. 

2. They are, in general, greater in traits peculiar to man than in traits 
characteristic of all mammals. 

Studies of sex differences are reviewed for evidence of a difference in 
relative variability and Thorndike concludes that women are probably 
slightly less variable than men (pp. 193 ff.). Several studies made in 
the period from 1908 to 1914 on the effects of practice on individual 
differences are reviewed (pp. 304 ff.) but the results are inconclusive.
However, he at least formulates the problem. 

H. L. Hollingworth (41, pp. 74-84) discusses the problem of rela- 
tive variability and concludes that variability is probably greater for 
traits that are more complex, more functional, more recent (phylo- 
genetically or ontogenetically), more specific, more symbolic, less used 
and less relevant. 

Hull (42) makes a study of the question of variations in traits within 
the individual. He calls this “trait variability” and reaches a tentative
conclusion that such trait variability is about 80 per cent of the vari- 
ability of single traits in the population. 

Wechsler (98) raises the question as to the range of human capacities.
After eliminating the pathological extremes, he determines the
ratio between the measurements of the highest and the lowest mem- 
bers of a group and draws the conclusion that with very few exceptions 
this ratio does not exceed 3.0:1 (p. 62). Wechsler emphasizes the point 
that while individual differences are real and are important they are 
not nearly as great as is commonly assumed. He believes this conclusion 
has an important bearing on the question of democratic government 
and on related social and economic problems. 

Wechsler also subscribes to the idea that complex traits are more 
variable “. . . since the variability of any given phenomenon is necessarily
the function of the product of the variabilities of the individual
factors which determine it” (p. 59).

This brief review will suffice to indicate the chief points in the his- 
torical background of current ideas of relative variability. Before 
attempting to pass judgment on the accuracy of these and other gen- 
eralizations it will be useful to have a look at the problem of variation. 

The Mechanism of Variation 

Most normal psychological characteristics show continuous variation
from the very low to the very high degrees of the trait. Geneticists
usually interpret this to mean that differences in the strength of inherited
characteristics are due to multiple factor inheritance: the
strength of the characteristic is determined by the joint influence of
a number of genes. This hereditary potential would be affected by
environmental factors to a greater or lesser degree, so that the final
result would depend on both kinds of factors.


TABLE I

The Relative Variabilities (V) Of Distributions Resulting From Chance
Combinations Of Coins And Dice

                          S.D.      M       V

4 Coins                   1         6       17
9 Coins                   1.5       13.5    11
4 Dice                    3.416     14      24
9 Dice (Theoretical)      5.123     31.5    16.3
9 Dice (Empirical)        5.073     32.3    15.7

Heads and tails on the coins are counted as 1’s and 2’s respectively. Four coins falling
three heads and one tail give a total of five.

The $\mathrm{S.D.}^2$ of the first N natural numbers, 6 in case of dice and 2 in case of coins as
used above, equals $(N^2 - 1)/12$ (Yule, 109, p. 143). When we add equipollent independent
variables the $\mathrm{S.D.}^2$ varies as the number (n) of the variables. Hence the S.D. of the
distribution resulting from tossing n dice or coins is given by the formula:
$\mathrm{S.D.}^2 = n(N^2 - 1)/12$. For nine dice this would be:
$\mathrm{S.D.}^2 = 9(6^2 - 1)/12 = 26.25$; $\mathrm{S.D.} = 5.123$.
As an empirical check, nine dice were tossed 200 times with results as given under
“Empirical.”

In order to get examples of variation under conditions where we 
understand clearly the material with which we are dealing, some results,
both theoretical and empirical, from certain chance combinations
of coins and dice are presented in Table I. 
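The theoretical entries of Table I can be recomputed directly. The following is a minimal sketch, assuming the formula in the note to the table for the S.D. of a total of n tosses and the usual mean (N + 1)/2 for one toss scored 1 to N; the function name is hypothetical.

```python
import math


def sum_of_tosses(n, faces):
    """Theoretical M, S.D. and V for the total of n independent tosses,
    each toss scoring 1..faces with equal probability."""
    mean = n * (faces + 1) / 2
    variance = n * (faces ** 2 - 1) / 12
    sd = math.sqrt(variance)
    return mean, sd, 100 * sd / mean


for label, n, faces in [("4 coins", 4, 2), ("9 coins", 9, 2),
                        ("4 dice", 4, 6), ("9 dice", 9, 6)]:
    m, sd, v = sum_of_tosses(n, faces)
    print(f"{label}: M = {m:g}, S.D. = {sd:.3f}, V = {v:.1f}")
```

Rounded to the precision of the table, the printed values agree with the theoretical rows of Table I.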

When two factors combine to produce a more complex total the
combination may be an additive process or it may involve multiplica- 
tion with the result a product of the elements. Psychologists usually 
accept the additive assumption. The addition of normal distributions, 
as in case of our dice problem, gives a normal distribution while prod- 
ucts give a skewed distribution. Since distributions of complex psycho- 
logical traits are commonly approximately normal this seems to justify 
the tentative use of the additive method. The difference between these 
two assumptions, as we shall see later, is vitally related to the problem 
as to whether simple or complex traits are relatively more variable. 

The “Laws” of Relative Variability
Complexity vs. Simplicity 

If we examine Table I we find that, whether we are dealing with 
coins or dice, as we pass from the totals for four variables to nine variables
the S.D.’s increase and the V’s decrease. This happens because
the M’s increase in proportion to the number of variables while the
S.D.’s increase in proportion to the square root of that number. On this basis
then an increase in complexity results in an increase in absolute variability
but a decrease in relative variability.

If, however, we compare the totals for four coins and four dice we 
have the same number of variables in each case, but with an increase 
in the variability of the variables, and under these conditions, both the 
S.D.’s and the V’s increase.

On this basis if we assume intelligence to be made up of capacities 
for perception, memory, reasoning and such other behavior as the 
reader wishes to add, the value of V for intelligence should be lower 
than the V’s of its components. In another population where intelligence
has the same components but where the components themselves
varied more widely in strength the relative variability of intelligence 
would be greater than in the first population. 
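The square-root relation used in this argument can be written out explicitly. As a sketch under the additive assumption, let each of n independent components have the same mean and the same standard deviation (the symbols \mu and \sigma are introduced here for illustration only):

```latex
M_n = n\mu, \qquad
\mathrm{S.D.}_n = \sqrt{n}\,\sigma, \qquad
V_n = 100\,\frac{\mathrm{S.D.}_n}{M_n}
    = \frac{100\,\sigma}{\sqrt{n}\,\mu}
    = \frac{V_1}{\sqrt{n}}
```

For a single die V_1 is about 48.8, so V_4 is about 24.4 and V_9 about 16.3, in agreement with Table I. Raising \sigma while holding n and \mu fixed raises V_n in the same proportion, which is the case of the second population.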

According to accepted statistical principles we may administer two 
tests to the same group of subjects and after computing the M’s and
S.D.’s of the two tests separately we may determine the mean of the
sum of the two tests simply by adding the means of the two separate 
tests, and we may obtain the S.D. of the summed distribution by using 
the formula for that purpose (109, p. 211). From this it follows that
the value of V for the sum of any two tests can never exceed the higher
of the two values of V for the two tests which were combined, provided
the test scores were of such a character that it was legitimate to compute
a V in the first place. In this sense it is a statistical impossibility to increase
the value of V by combining test scores. If then complex traits
are considered to be the sum of simple traits, they must be relatively 
less variable. 
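The reasoning behind this limit can be sketched numerically. The S.D. of a sum is largest when the two tests correlate perfectly, in which case it equals the sum of the two S.D.'s; V for the sum is then a weighted average of the two separate V's and so cannot exceed the larger one. The example below assumes the standard formula for the S.D. of a sum of two correlated variables (presumably the one cited from Yule) and positive means; the test figures are hypothetical.

```python
import math


def v_of_sum(m1, sd1, m2, sd2, r):
    """V of the sum of two tests with means m1, m2, S.D.'s sd1, sd2
    and correlation r between them."""
    mean = m1 + m2
    sd = math.sqrt(sd1 ** 2 + sd2 ** 2 + 2 * r * sd1 * sd2)
    return 100 * sd / mean


# Hypothetical tests with V = 20 and V = 30.
m1, sd1, m2, sd2 = 50.0, 10.0, 40.0, 12.0
v1, v2 = 100 * sd1 / m1, 100 * sd2 / m2

for r in (-1.0, 0.0, 0.5, 1.0):
    print(r, round(v_of_sum(m1, sd1, m2, sd2, r), 1))
```

Even at r = 1 the combined V stays below the higher of the two separate values, and it falls further as the correlation drops.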

TABLE II

Coefficients Of Variation (V) For Different Characteristics

Trait                                    Subjects                           V

Weight of spleen (6)                     54 normal males, 30-40 years old   45¹
Olfactory acuity (4, 34)                 Assorted adults                    30+*
Visual acuity (20, 47, 77)               805 17-20 yr. olds plus others     27+
Strength of back (36)                    Men 17-30 years old                22*
Simple visual reaction time (29, 35)     5564 English adults plus others    21¹,⁴
Strength of grip (106)                   609 16 year old boys               21¹
Weight of liver (33)                     73 English adults                  20»
Auditory acuity (68)                     61 young adults                    19+
Vital capacity (106)                     599 16 year old boys               19¹
Card sorting (72)                        148 16 year old girls              17
Visual perception span and speed (58)    98 males, 18-29 years old          15*
Memory span, digits (98)                 236 male adults                    15*
Tapping (106)                            615 boys, 16 years old             13¹
Highest audible pitch (47)               805 17-20 year olds                12
Body weight (43)                         U. S. soldiers in 1917             12
Weight of cerebrum (65)                  308 English adult males            11
Brain weight (65)                        416 English adult males            9
Cranial capacity (51, 65)                English adult males                8
Knee height (43)                         U. S. soldiers in 1917             8
Leg length (43)                          U. S. soldiers in 1917             7
Stature (43)                             U. S. soldiers in 1917             4
CAVD intelligence (88)                   Assorted                           3»
Body temperature (102)                   601 convicts                       0.5*

¹ M, S.D. and V calculated by the writer.
² V estimated by the writer from percentage distribution.
³ V calculated by the writer from percentile distribution.
⁴ Based on reciprocals of time scores.
⁵ V calculated by the writer.

If we turn from statistical theory to empirical data to test our conclusion
we confront the fact that most intelligence tests do not have
true zeroes. Hence we cannot compute a true value of V for them. 
An exception is found in the CAVD test worked up by Thorndike and 
others (88). The mean adult score on this test is given as 36.5 and the 
S.D. is 1. This gives a V of 2.74. By examining Table II it will be seen 
that this is comparatively a very low value and agrees fully with 
theoretical expectations. 

Even if we make a very large allowance for error in Thorndike’s
determination of the zero of the CAVD scale and multiply the value of
V by 3 it would be only 8.2, which is relatively a low value.

Thurstone (89) also has attempted to determine the true zero for 
mental tests. His method is based on the assumption that relative 
variability of intelligence remains constant during growth. On a priori 
grounds the writer questions this assumption because all of the results 
he has seen for physical, psychomotor and sensory traits measured by 
objective scales with true zeroes show variations in V’s with age. Under 
these conditions it would be very surprising if the V’s for intelligence 
did not change with age. In the second place Thurstone’s results are
not consistent. One application of his method to Binet Test results
(89, p. 196) gives a V of 7.2 (M = 13.9, S.D. = 1) while another application
(90, p. 574) gives a V of 37.6 (M = 2.66, S.D. = 1). This difference
can be only partly accounted for by the difference in the variability of 
the two groups tested. The growth curve in the first instance is a con- 
ventional negatively accelerated growth curve similar to those usually 
found. In the second instance the curve is first positively accelerated 
and later is negatively accelerated. These disagreements would seem 
to afford ample basis for questioning the validity of Thurstone’s
method. 
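Because the S.D. is the unit in each of these scales, V depends entirely on how far the mean lies above the assumed zero. A minimal sketch, using only the means and S.D.'s quoted above (the function and variable names are hypothetical):

```python
def coefficient_of_variation(mean, sd):
    """Pearson's V: the S.D. expressed as a percentage of the mean."""
    return 100 * sd / mean


cavd = coefficient_of_variation(36.5, 1.0)           # CAVD adult scores
binet_first = coefficient_of_variation(13.9, 1.0)    # Thurstone, first application
binet_second = coefficient_of_variation(2.66, 1.0)   # Thurstone, second application

print(round(cavd, 2), round(binet_first, 1), round(binet_second, 1))
# 2.74 7.2 37.6
```

Moving the zero from 13.9 to 2.66 S.D. units below the mean is enough to multiply V about fivefold, which shows how much rests on the location of the zero.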

Ability as represented by a total score is certainly more complex 
than the abilities represented by the separate test scores. Yet the V’s 
for total scores, as noted above, are regularly lower than the V’s of at 
least some of the individual tests. An example of this may be had from 
the VACO test results reported by Freeman and Flory (25, p. 42) for 
children aged 13 years. They report V’s as follows: vocabulary 20.3, 
analogies 23.9, completion 20.3, opposites 27.9, and total 18.5. 

If we consult Table II we find that with the exception of the weight 
of the spleen the highest V’s are for olfactory and visual acuity, strength 
of back and of grip, and reaction time. These are hardly what psycholo- 
gists would usually call the most complex traits. Visual perception span 
and memory span for digits are near the middle of the list. 

Additional evidence that simple functions are relatively more variable
than complex functions is provided by some studies not included
in Table II. 

Leukhart (49) gives the times for monocular accommodation for 
14 subjects. Since higher time scores mean slower and poorer accommodation
I have calculated the value of V for the reciprocals of these
time scores and find it to be 40.6. Warden, Brown and Ross (96) determined
the threshold for motion acuity for 28 subjects at scotopic
levels of illumination. This is stated in terms of angles and the lowest
scores indicate the greatest visual acuity. For this reason I have again 
determined the V based on reciprocals and find it to be 147. Since one 
extreme case is responsible for a large part of this value, I determined 
V for the remaining 27 cases and found it to be 74 — still a very high 
value. In both of the above studies the number of cases is rather small 
and the sampling is limited to groups of generally superior young adults. 
However, when we find wide variability under such conditions it ap- 
pears improbable that we should not find it with a larger and less 
selected sample of the population. 
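The procedure of computing V on reciprocals may be illustrated as follows. This is only a sketch: the time scores are invented for the purpose and are not Leukhart's data, and the use of the population S.D. is an assumption, since the text does not say which form was used.

```python
import statistics


def v(scores):
    """Pearson's V = 100 * S.D. / M, using the population S.D."""
    return 100 * statistics.pstdev(scores) / statistics.fmean(scores)


# Hypothetical accommodation times in seconds (not Leukhart's data).
times = [0.4, 0.5, 0.6, 0.8, 1.0, 1.6]
speeds = [1 / t for t in times]  # reciprocals: a higher score is now better

print(round(v(times), 1), round(v(speeds), 1))
```

The two values differ, so V for the reciprocals is not simply V for the raw times restated; the direction of the scale must be settled before V's from different studies are compared.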

Hermans (38) had 100 subjects observe the size of a standard 100 
mm. aperture with binocular, with monocular and with pinhole vision
and attempt to match this by adjusting a different aperture with 
binocular vision. Table III gives the results. M's and S.D.’s are from 
Hermans, V's have been added by the writer. Relative variability in 
responses is clearly greater with the simpler pinhole vision and less with 
the higher and more complex binocular vision. 


TABLE III

Relative Variabilities Of Judgments Of Size At Three Levels Of
Vision (Hermans, 38)

                      M         S.D.      V

Pinhole vision        67.25     23.05     34
Monocular vision                14.30     15
Binocular vision      104.74    6.54      6


McGeoch (52) used three groups of nonsense syllables of different 
levels of association value, as determined by Glaze, and added a group 
of three-letter words. The 00% syllables are those for which subjects 
reported no associations, the 53% syllables are those for which 53 per 
cent of the subjects reported associations, and so on. McGeoch de- 
termined the amounts of these that were learned in a given period of 
time. Table IV gives the results. M’s and S.D.’s are from McGeoch. 


TABLE IV

The Relation Of Relative Variabilities To The Associative Value Of
Syllables (McGeoch, 52)

Material            M       S.D.    V

3-letter words      9.11    1.12    12
100% syllables      7.35    1.96    27
53% syllables       6.41    2.37    37
00% syllables       5.09    2.60    51


V’s are added by the writer. This table shows that as material becomes
more meaningful, and hence presumably involves more complex
learning functions, there is relatively less variability in the amount 
learned. 
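The V column of Table IV follows directly from the reported means and S.D.'s. A brief check, assuming V = 100 × S.D./M rounded to the nearest whole number:

```python
# (material, M, S.D.) as reported by McGeoch in Table IV.
rows = [
    ("3-letter words", 9.11, 1.12),
    ("100% syllables", 7.35, 1.96),
    ("53% syllables", 6.41, 2.37),
    ("00% syllables", 5.09, 2.60),
]

vs = [round(100 * sd / m) for _, m, sd in rows]
print(vs)  # [12, 27, 37, 51]
```

As the material becomes less meaningful the mean falls and the S.D. rises, so V climbs at every step.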

While the zeroes of these scores may not be true zeroes, that can 
hardly account for the obtained differences. Unfortunately it is not 
possible to extend this type of analysis to a study of higher levels of 
rational memory because of the fact that nothing approaching a suitable 
measuring instrument is available. 

In the psychological field the evidence certainly does not support
the generalization that complex traits are relatively more variable — 
rather the opposite. This conclusion is further supported by considering 
anatomical measurements. 

Some of the internal organs such as the spleen and the liver are 
relatively the most variable. If one questions these results because of 
admitted difficulties in securing a satisfactory selection of human bodies 
to dissect, it may be noted that the same relation is found for laboratory 
animals. Thus Donaldson (16, p. 225) reports V’s for weights of the
rat as follows: brain 10, body 19, heart 24, liver 25, thymus 34, and 
ovaries 43. As in man, body weights show less variability than is 
shown by some of its parts. The brain, which is hardly a “simple” 
structure, is relatively much less variable than the internal organs such 
as the spleen and the liver. Likewise, in Table II, consider the series — 
stature, leg length, and knee height. Stature is more complex and rela- 
tively less variable. Similarly the cerebrum is relatively more variable 
than the whole brain and the latter is relatively more variable than the 
brain case. 

In contrast we may note that body weight is relatively more vari- 
able than stature. As a matter of geometry the body is a solid and has 
three dimensions. Similar solids vary as the cubes of their single 
dimensions — and cubes are products of three dimensions. In our dice 
and coin illustration we worked with sums and the results seemed to 
agree with the psychological test results. If, however, we deal with 
products the result is different and then Wechsler, as quoted above, 
would be right when he says that complex characteristics are relatively
more variable “since the variability of any given phenomenon is neces- 
sarily the function of the product of the variabilities of the different 
factors which determine it” (98, p. 59). 
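The effect of a product on relative variability can be illustrated by simulation. The sketch below assumes, for illustration only, a normally distributed linear dimension with V = 4 (the value for stature in Table II) and a weight strictly proportional to its cube; the sample size and seed are arbitrary.

```python
import random
import statistics


def v(scores):
    """Pearson's V = 100 * S.D. / M, using the population S.D."""
    return 100 * statistics.pstdev(scores) / statistics.fmean(scores)


random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical linear dimension with mean 100 and S.D. 4, i.e. V = 4.
lengths = [random.gauss(100.0, 4.0) for _ in range(100_000)]
volumes = [x ** 3 for x in lengths]  # similar solids vary as the cube

print(round(v(lengths), 1), round(v(volumes), 1))
```

The cube shows a V close to three times that of the linear dimension, or roughly 12, which is the value reported for body weight in Table II.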

While this generalization is true when we compare stature and 
weight it does not appear to be true when we compare psychological
characteristics. Psychological test results agree better with the additive
assumption usually made, and, as I shall point out later, the assumption
of subtraction is helpful in understanding certain changes in
relative variability associated with old age, fatigue and forgetting. 
Certainly there is no good psychological, biological or mathematical 
reason why we must assume that psychological variation is always 
and “necessarily” a function of products rather than of sums or differences.

Functions vs. Structures 

Hollingworth’s second law states that functions are more variable 
than structures. From Table II we may see that the lowest V’s are for
functions while the highest is for a structure. With advancing age rela- 
tive variability of visual acuity climbs rapidly and is above 90 by the 
age of 60 (77, p. 85). It might be difficult to match this with an example 
of structural variability. But in young adults and in children several 
structures are relatively more variable than visual acuity. Brain weight
is relatively more variable than CAVD intelligence. A very wide 
range of V’s can be supplied for both structures and functions, hence it 
would appear that this “law” hardly agrees with the facts. 

Symbolic vs. Concrete 

Since CAVD intelligence is distinctly symbolic and since the ability 
to handle symbols is presumably complex it hardly seems necessary to 
discuss this “law” separately. 

Recent vs. Ancient 

Stern, Thorndike and Hollingworth have supported the idea that 
traits recently acquired, whether phylogenetically or ontogenetically,
are more variable than those of more ancient vintage. This seems to be
in conflict with the principle supported by Vernon that the variability 
of a developing organism decreases with growth. Also it appears to be 
in conflict with the fundamental biological theory of evolution. Thus
Shull (81, p. 243) writes as follows (italics are my own) : 

The similarity of the species of a genus is held to indicate kinship, but since
there is greater diversity among the individuals of a genus than among the members 
of a species, the common stock from which the species of a genus have sprung 
must have existed at an earlier time, in order that evolution could bring about 
the degree of divergence now observed. 

Reptiles and lower animals are “cold-blooded” because they lack
our temperature regulating mechanism. Yet our body temperatures
show much less relative variability than various “older” functions.
Also body temperature is less variable in adults than in children. The
higher levels of CAVD intelligence are late in appearance both in the
race and in the individual and show relatively small variation. The
sense of smell was originally the dominant distance receptor and the
cerebrum first began to develop as an olfactory correlation center. Yet 
at the present time and in young human adults olfactory functions 
appear to be relatively more rather than less variable than visual and 
auditory functions. Auditory acuity develops phylogenetically after 
visual acuity and seems to be relatively less variable. Musical talent 
can hardly be said to appear early in the race but it does appear early in 
childhood and it seems to be more variable than abstract intelligence. 

This generalization that traits more recently acquired — whether 
individually or racially — are more variable than older traits is probably 
a deduction from the supposedly greater variability of complex func- 
tions. It appears to be equally erroneous. 

Specific vs. Generic 

Darwin says: “It is notorious that specific characters are more variable
than generic.” And “. . . the points in which all the species of a
genus resemble each other, and in which they differ from allied genera,
are called generic characters.” But since the nearest relatives of Homo
sapiens are known only through fossil remains and since biologists are 
not agreed (Dobzhansky, 14) on the classification of these fossils into 
genera and species it seems unwise to attempt to discuss this principle 
as Darwin has defined it. 

Hollingworth’s statement of the principle is that specific and less 
widespread traits are more variable than those that are more generic. 
Interpreted in this way it means about the same thing as Thorndike’s
second law that variability is greater in traits peculiar to man and as 
Pearson’s claim that civilized man is relatively more variable than 
primitive man. It is also related to the question discussed above under 
“Recent vs. Ancient.”

Among the important differences between man and our nearest 
surviving relatives, the gorilla and the chimpanzee, are man’s upright
posture, longer legs than arms, more uniform teeth, virtual absence 
of hair from most of the body, smaller face, and greater brain and 
intelligence. 

Young human adults do not appear to vary greatly in upright 
posture when standing or walking. Schultz (78, 80) finds less relative 
variability in face, head, trunk and limb measurements in man than in 
some lower primates. Some primates are relatively more variable in
arm length than man is in leg length. Relative absence of hair from the
human body and limbs is a striking case of uniformity in a specific character.
This is especially true of the human back. We have already
noted that the human brain and CAVD intelligence are not strikingly
variable. On the other hand, our internal organs are generic and some 
of them are very variable. Visual and olfactory acuity are quite generic
and are also quite variable. 

All members of the Order primates are alike in having one head, one 
trunk, four limbs, two ears, one tongue, etc., but if we are studying 
quantitative variations in characteristics — and that happens to be our 
present problem — we find some of the highest coefficients of variation 
for characteristics which are common not merely to the members of a 
genus but to the members of an entire family, order or class. The dis- 
tance between the nipples of human females has a V of about 20 
(92, p. 108). This is a mammalian characteristic and is much higher 
than the V’s for most linear measurements. Wide variations do occur 
in a single species, especially in domesticated animals which have been 
subjected to controlled breeding and selection, but even here the 
members of a breed of dogs or chickens tend to be alike rather than 
different. However, if less widespread traits are more variable than 
more generic traits it would seem that a single breed of dogs or chickens 
should show very wide variation in those characteristics which are
peculiar to the breed. 

Less Relevant vs. More Relevant 

Some writers have held that the variability of traits is inversely 
correlated with their biological relevance. This means that the less 
important the trait is for survival the more it may be expected to vary 
in the species. This seems to be based on the apparently reasonable 
deduction that the greatest variability cannot exist in things which are 
closely related to life. Against this is the fact that some of the internal 
organs are highly variable. Sensory capacities are essential for environmental
adjustments but some of them are very variable. In lower animals
many examples could be given of characteristics which seem to
have little or no survival importance but which are rather uniform in a 
species. “A survey of the characters which differentiate species (and
to a less extent genera) reveals that in the vast majority of cases the
specific characters have no known adaptive significance” (Robson and
Richards, 76, p. 314). Yet it seems obvious that if these characters were
not reasonably uniform in a species they could not be used to differ- 
entiate species. 

Less Used vs. Most Used 

According to Hollingworth (41, p. 78), “Infrequently used traits
are more variable than are traits or activities more constantly employed.”
As a psychological principle we might make something of a
case for this “law” in connection with the acquisition of skills in the
individual. This problem will be discussed under the effects of practice. 

The Effects of Practice 

Thorndike formulated and defended the idea that acquired traits 
are more variable than native ones. This is tied up with a lively con- 
troversy over the question as to the effects of the environment, including 
education and practice, on individual differences. Given certain indi- 
vidual differences, how will these be affected by adding certain variables 
in the form of practice or training? 

If we consider the total “scores” arrived at by tossing n dice to
represent ability before training and that the effects of practice could be
represented by adding to these original scores some new scores secured
by tossing p additional dice, each child’s achievement after practice
could be represented by the total of n + p dice. If this situation holds
it follows from previous discussions that the normal effect of education 
and practice is to reduce relative variability. 
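This dice model of practice can be worked through numerically. The sketch below assumes ordinary six-sided dice and the standard formula for the S.D. of a sum of two correlated totals; the function name, the choice of n = 4 and p = 5, and the value r = -0.35 are hypothetical and serve only as illustration. Setting r = 0 gives the independent case just described, while a negative r represents practice that falls more heavily on those with lower original scores.

```python
import math

DIE_MEAN = 3.5
DIE_VARIANCE = 35 / 12  # (6**2 - 1) / 12 for a single die


def v_after_practice(n, p, r=0.0):
    """V of the total of n 'ability' dice plus p 'practice' dice.

    r is the correlation between the ability total and the practice
    total; r = 0 corresponds to independent tosses.
    """
    sd_n = math.sqrt(n * DIE_VARIANCE)
    sd_p = math.sqrt(p * DIE_VARIANCE)
    sd_total = math.sqrt(sd_n ** 2 + sd_p ** 2 + 2 * r * sd_n * sd_p)
    return 100 * sd_total / ((n + p) * DIE_MEAN)


print(round(v_after_practice(4, 0), 1))           # before practice
print(round(v_after_practice(4, 5), 1))           # independent practice
print(round(v_after_practice(4, 5, r=-0.35), 1))  # negatively correlated practice
```

With independent practice V drops from about 24 to about 16, as in Table I, and a negative correlation lowers it further.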

We have evidence, however, that our n and p variables in school 
work are not independent variables but are negatively correlated. The
dullest students in school are prodded most and work hardest. Frequently
the brightest students are permitted to loaf. Thus May (55)
in a study of college students found a correlation of -.35 between
intelligence and the time spent in study. Drake (17) gave tests to
college classes in biology and in history at the beginning and at the 
end of the semester, converted the raw scores to standard scores, and 
subtracted scores on the first test from scores on the second test to de- 
termine gains. He found negative correlations between first scores and 
gains and between intelligence and gains. Kelley (45) challenged the 
then current conception and held that formal education especially at 
the elementary school level reduced individual differences in several 
respects. Meltzer and Bailor (56) tested psychology students at the
beginning and end of a course and found men decreased from 40 to 21
and women from 49 to 14 in relative variability in knowledge of the
subject.

One method of studying the effects of practice and training is to 
compare some of the results secured from intelligence tests and achieve- 
ment tests. If these results are expressed in terms of age or grade 
norms and the range from the tenth to the ninetieth percentile points 
is determined we find that variability in achievement is usually less
than variability in intelligence. For example, at the age of ten years the
tenth to ninetieth percentile range on the revised Stanford-Binet test
is 4.15 years (85, p. 40). On the Stanford Achievement Test, Cornell 
(11, p. 87) finds the corresponding range to be 3.4 years. This method 
largely avoids the difficulty imposed by the fact that the tests do not 
have true zeroes. The results do not justify the conclusion that ordinary
school training increases individual differences within the trained
group. 

In controlled practice experiments gross gains are usually positively 
correlated with initial scores, but percentage improvement and initial 
scores are negatively correlated. Individuals with lower scores make 
relatively larger gains than those with larger scores. 

Kincaid (46) reviewed 24 experiments on the effects of practice and 
found negative correlations between initial scores and percentage im- 
provement in 22 of the studies. In 19 of the 24 experiments larger S.D.’s 
were found after practice than before, but in 16 of the 24 cases V’s were
smaller than before. Both the S.D. and the M usually increase with 
practice but the latter increases more, with the result that relative 
variability usually decreases. 

Reed (73, 74), reviewing practice studies to 1931, concludes that 
in 77 per cent of the studies (N = 70) V decreases with practice, and in
93 per cent of the studies (N = 58) the correlations between initial
performance and per cent improvement are negative. 

Burns (8) reviews 84 practice studies and agrees essentially with 
Reed in his findings. He suggests that one of the important reasons for 
differences in the results of practice studies is to be found in differences 
in motivation. 

Additional practice studies have been made since 1937 but they have 
not changed the general picture nor have they made it clear why in 
some cases relative variability increases with practice. A recent study 
by Yoshioka and Jones (108) of stylus maze learning reports that with 
practice there is a sharp increase in relative variability in errors but 
not in time scores. In the latter case the work unit was constant and 
time scores decreased as learning progressed. In both cases then V’s 
have been computed for scores for which zero would be a perfect score 
(though impossible in case of time scores). Yet, as was noted earlier, 
we cannot legitimately use V scores unless the scale has at least
approximately a true zero.

A hypothetical example will clarify the difficulty. Children given a spelling 
test have an average of 80 right, 20 wrong, S.D. = 5. After practice they test
96 right, 4 wrong, S.D. = 2. V’s based on “rights” are 6 before practice and 2
after practice. Based on “errors,” V’s are 25 before practice and 50 after
practice. Clearly the true variability in knowledge of spelling is not represented
by a V of 50 after practice. For this reason before V’s are computed all scores
which show improvement by decreasing scores must be converted to a form in
which increasing ability is represented by increasing scores. Otherwise the V
scores may show the opposite of the true situation, though V’s based on time
scores are usually more nearly correct than V’s based on error scores.
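
The arithmetic of this example can be checked in a few lines. The sketch below is an illustration only: it assumes that V is 100 times the S.D. divided by the mean, as the figures in the example imply, and the function name is invented for the purpose.

```python
def relative_variability(mean, sd):
    """Coefficient of relative variability, V = 100 * S.D. / M."""
    return 100.0 * sd / mean

# Spelling test of 100 words; figures from the hypothetical example.
before = {"rights": 80, "errors": 20, "sd": 5}
after = {"rights": 96, "errors": 4, "sd": 2}

v_rights_before = relative_variability(before["rights"], before["sd"])
v_rights_after = relative_variability(after["rights"], after["sd"])
v_errors_before = relative_variability(before["errors"], before["sd"])
v_errors_after = relative_variability(after["errors"], after["sd"])

print(round(v_rights_before), round(v_rights_after))  # 6 2
print(round(v_errors_before), round(v_errors_after))  # 25 50
```

The same S.D. gives a falling V when scores are counted as rights and a rising V when they are counted as errors, because only the mean in the denominator changes direction.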

Relatively few experiments have shown statistically significant in- 
creases in V's after practice. Further study of the exceptions is needed 
to explain why they give different results. If we took 100 individuals, 
divided them into four equal groups of equal initial ability and then 
gave distinctly different amounts of practice to each group, relative 
variability for the total group would normally be greater after practice 
than before. When environment operates differentially in this way, as it
apparently does at times, relative variability might be increased. Those
with higher intelligence tend to remain in school longer so that amounts 
of practice are distinctly unequal. This should produce a wide spread 
in achievement along particular lines. But to measure this fairly we 
must have tests with true zeroes and with equal units. 

Forgetting 

To a considerable degree forgetting is a reversal of the process of
learning. In terms of our dice illustration it is possibly similar to what
would happen if we started with “scores” based on throwing n dice and
subtracted from each score the scores resulting from throwing m dice, 
m being less than n. Apparently there is a low negative correlation 
between learning capacity and rate of forgetting. The mean score 
after forgetting would be lower than the original score and the standard 
deviation would be larger (109, p. 211). From this it follows that if the 
dice illustration holds, we should expect an increase in relative varia- 
bility after forgetting. 
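
The dice illustration can be worked out exactly. The sketch below is a simplified reading of it: it assumes fair six-sided dice and independent throws (so it ignores the low negative correlation mentioned above), and the values of n and m are chosen only for illustration.

```python
import math

DIE_MEAN = 3.5          # mean of one fair six-sided die
DIE_VAR = 35.0 / 12.0   # variance of one fair six-sided die

def dice_score_stats(n_added, m_subtracted=0):
    """Mean, S.D. and V of (sum of n dice) minus (sum of m independent dice)."""
    mean = DIE_MEAN * (n_added - m_subtracted)
    sd = math.sqrt(DIE_VAR * (n_added + m_subtracted))  # variances add
    return mean, sd, 100.0 * sd / mean

n, m = 10, 4  # illustrative values only, with m less than n
mean_before, sd_before, v_before = dice_score_stats(n)
mean_after, sd_after, v_after = dice_score_stats(n, m)
print(f"before: M={mean_before:.1f} S.D.={sd_before:.2f} V={v_before:.1f}")
print(f"after:  M={mean_after:.1f} S.D.={sd_after:.2f} V={v_after:.1f}")
```

Under these assumptions the mean falls while the S.D. rises, so V roughly doubles in this particular case.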

Tilton (91) reviewed 39 studies of the effects of forgetting on indi- 
vidual differences and found that in 24 cases absolute S.D.’s increased. 
The ratio of the average S.D. after forgetting to the average S.D. before 
forgetting was 106.7 to 100. The average V after forgetting divided by 
the average V before forgetting gave 165.6. In the majority of the 
cases then experimental evidence has shown that forgetting does in- 
crease relative variability. Forgetting tends to return a group to the 
greater relative variability which existed before practice. 

Watson (97) reports that after forgetting relative variability is 
greater for recall than for recognition. Probably this is related to the 
fact that power of recall is lost more rapidly than power of recognition.

Effect of Fatigue

Fatigue, like forgetting, involves a loss of function. It differs in 
that it is more immediately and acutely related to physiological func- 
tions such as circulation and respiration. Short and long work periods 
may differ because the short work period may not be greatly influenced 
by “second wind” while the long experiment is so influenced.

Wells (100) had 10 subjects tap for 30 seconds and compared the 
first 5 seconds with the last 5. Under these conditions, with a short work 
period, a reduction in relative variability was found. 

With long work periods, increases in relative variability have usually 
been found. Weinland (99) had 10 subjects work with the ergograph 
over a period of six months and reported that V increased with fatigue. 
He concluded that increased variability was due largely to loss of con- 
trol. Manzer (54) had 27 college men work to exhaustion on the ergo- 
graph and found that relative variability of work with fatigued muscles 
was 309 per cent of what it was with unfatigued muscles. Philip (69) 
had twelve subjects work about seven hours at tapping until they were 
very fatigued. He found an increase in relative variability with 
fatigue and like Weinland attributed this largely to loss of control. 
Edwards (18) reports the effects of the loss of 100 hours of sleep on 19 
subjects. He finds no great differences in V’s for controls and experimental
group but does find that the extremes in reaction times are in
the experimental group. Flugel (24) had 46 children do addition for 20
minutes daily for 46 days. He reported that fatigue and ability were 
found to be negatively correlated. This should give an increase in rela- 
tive variability. 

Some studies have reported qualitative changes such as doing the 
right thing at the wrong time. Marked increases in irritability are also 
reported. The unfatigued individual is evidently more stable and 
predictable. 

One of the technical difficulties in the study of fatigue is found in the 
fact that long fatigue tests are monotonous. Hence what is classified as 
fatigue is probably partly boredom and lack of effort. This factor 
would tend to produce variation in the results of different experiments.

Age Differences 

Vernon’s theory that variability decreases with growth has been 
mentioned. V’s based on data from Montague and Hollingworth (59) 
for length and weight of infants at birth are about 6 and 15 respectively. 
For young adults the corresponding figures are about 4 and 12, based 
on Army results for men (Table II) and on data from Doll (15, after
Smedley) for women. Thus far then the principle seems to hold.
However, there is a prepubertal acceleration in growth which results in 
a rise in V’s at that period. The greatest prepubertal or pubertal rela- 
tive variability in boys occurs at about 14 years and in girls at about 
12 years. After birth then, V’s for height and weight fall, then rise, then
fall to maturity. In young adults relative variability is lower than at 
birth. 

Sensory and psycho-motor functions follow varying patterns with 
respect to relative variability during growth. No simple generalization 
seems to cover them. Relative variability in visual acuity rises from 
about 12 years to maturity. Relative variability in throwing a ball de- 
clines during adolescence in boys but seems to increase slightly in girls. 

Henmon and Livingstone (37) collected data from different sources 
and studied changes in relative variability of psycho-motor and mental
functions during growth. They found relative variability to decline
with age to maturity, and did not find any consistent evidence that
relative variability increases at the prepubertal period.

Israeli (44) finds decreasing variability in aesthetic judgments with 
age to maturity. Miles (58, after Price) reports data on visual perception,
span and speed which show a decline in V’s with age to maturity.

V’s based on group intelligence test results generally show declines 
with age to maturity. Freeman and Flory (25) report the results of a 
ten year study with the VACO Tests. This covers ages 8-17 inclusive. 
V’s decline regularly with age without evidence of a prepubertal bulge. 
Lincoln (50) reports V’s for the Yerkes-Bridges, the Pressey, and the
Dearborn intelligence tests. All show declines with age. Odom (60) 
collected data for the Otis, the National and the Illinois intelligence 
tests. They also show declines in V’s with age. Adkins (1) finds that 
on retests with the Otis and with the Morgan tests V’s decline from 
grade 7 through grade 12. 

The fact that these tests do not have a true zero leaves some doubt 
as to the exact significance of the results reported. However, analysis 
of mental growth curves has convinced the writer that there is a real
decrease in the relative variability of intelligence during adolescence. 

The Stanford-Binet is not a suitable instrument for measuring
changes in relative variability with age. It is used on the assumption
that the S.D., and hence V, remains constant at 16. Yet the original
report by Terman and Merrill (85, p. 40) and later studies by Goodenough
(30) and Brown (7) show marked declines in S.D.’s from 2.5 to
6 years, then a marked rise to 12 years, and a decline thereafter. This
is a prepubertal bulge in relative variability, but whether it is due to
variations in growth or to the nature of the test is not clear. This is of
practical importance because it means that a child who is 2.5 S.D.’s
above average may fluctuate 20 points in IQ simply by following the
normal growth pattern for such children.
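
The size of this effect follows from simple arithmetic. The sketch below is illustrative only: the two S.D. values are assumed round figures, chosen to show how a swing of 20 points could arise, and are not quoted from Terman and Merrill or the later studies.

```python
def iq_at(sigma_units, sd, mean=100.0):
    """IQ of a child who stays a fixed number of S.D.'s above the mean."""
    return mean + sigma_units * sd

z = 2.5                       # child's standing, in S.D. units
sd_low, sd_high = 12.0, 20.0  # assumed S.D.'s at two ages (hypothetical)

iq_low = iq_at(z, sd_low)     # 130.0
iq_high = iq_at(z, sd_high)   # 150.0
print(iq_high - iq_low)       # 20.0
```

Any difference of 8 points between the S.D.'s at two ages would produce the same 20-point change for such a child.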

The available evidence indicates that relative variability of intelli- 
gence declines during adolescence, but we cannot yet state with any 
certainty the exact changes that occur from birth to maturity. 

After maturity the general tendency is for relative variability to 
increase. Tests with an important speed factor show marked declines 
in the averages with age while those which emphasize power rather 
than speed are on the average less affected. However, Gilbert (28)
reports losses in memory in senescence which show V's up to 87. This 
is for retention of Turkish-English vocabulary. For the age period 
from 20 to 29 years the corresponding V is 23. 

The results secured by Miles and Miles (57) which showed a decline 
in intelligence in old age were secured with a speed test. On this the 
S.D.’s change little from 30 to 70 years but the mean scores decline 
greatly and V's increase correspondingly. In another study already 
referred to above, Miles (58) submits data from Price showing M's and 
S.D.’s from age 6 to old age. For visual perception, span and speed 
at 6-7 years V = 32, at 20-24 years V = 13, and for 75-79 years V = 34. 
Goldfarb (29) found an increase in relative variability in reaction times 
in males in old age but the changes in females were not reliable. The 
marked increase in relative variability in visual acuity with age has 
already been referred to. V rises above 90 at the age of 60 in men (77, 
p. 85). The V for auditory acuity also rises with age but to a lesser
degree. 

There appears to be a fairly definite tendency for relative varia- 
bility to increase after early maturity, but there are great differences in 
the changes in different functions. An extreme increase is found in visual 
acuity while strength of grip shows only a slight increase. 

Variability and Change 

Darwin held that both rudimentary and highly developed struc- 
tures are unusually variable. Schultz (79, p. 321) makes a related 
suggestion: “Upon finding such a high variability in the human ear 
we are justified in suspecting that this structure is undergoing some 
evolutionary change, in the form of either an increase or a decrease in
size.” These conclusions, added to the finding of differences in relative
variability associated with age changes, seem to justify this tentative
conclusion: The relative variabilities of both structures and functions tend
to be positively correlated with rates of change, both phylogenetic and
ontogenetic, and both progressive and regressive.

Schultz cites human wisdom teeth and little toes as examples of 
widely variable structures which seem to be in process of regression. 
We found sight and smell relatively more variable than hearing. Smell 
has been undergoing involution while sight has been evolving. 

This hypothesis agrees well with what is known about individual 
growth and decline. Quantitative individual differences are produced 
by differential growth rates — of which the IQ is one index. Relative 
variability is high in the foetus and in infancy when growth is most 
rapid. Growth rates and relative variabilities both decline with age 
until the prepubertal period. At this point physical growth is accelerated
and the corresponding V’s rise. From this point, both growth
rates and V’s decline to maturity. Measurements of mental growth
have not shown the prepubertal acceleration found for physical growth,
and V’s for mental functions do not rise at this point. As mental functions
begin to decline with advancing age the corresponding V’s increase.
Visual acuity declines very rapidly between 40 and 60 years
and V rises above 90. 

High relative variabilities connected with change can in part at least 
be attributed to differences in the timing of changes in rates of growth 
and decline. Richey (75, p. 67), speaking of physical growth during the 
prepubertal and adolescent periods, comments: 

In general, it may be stated that measures of variability increase during the 
period that growth is comparatively rapid. A large part of the variability found 
for any particular group is probably due to differences in the periods of ac- 
celeration and retardation of the growth rates of the individuals making up 
the group. 

The prepubertal spurt in physical growth occurs earlier in some 
children than in others, even among those who will as adults be the 
same height. And decline in old age occurs earlier in some than in 
others. There are also differences in the age at which growth stops. 
Altogether these differences in rates of growth and decline and in the 
timing of changes of rate are responsible for a large part of variability. 

Sex Differences 

Scientific males have fairly commonly credited males with greater 
variability and have used this to explain the larger number of male 
geniuses. 

Pearson (66) assembled a considerable mass of statistics dealing with 
sex differences in physical measurements and concluded that human females
are more variable than males. Pearl (65) collected data for about
40 comparisons of the sexes on physical measurements and his results
show that V’s for females are higher in a ratio of about 3 to 2. Todd and
Lindala (92) made about 60 measurements of both white and Negro 
males and females, a total of about 120 comparisons. V’s for females 
are larger in a ratio of about 3 to 1. From these results it seems neces- 
sary to conclude that adult human females are relatively more variable 
than adult males in the majority of physical measurements. However,
the average difference is usually small. 

It is generally accepted that females mature physically earlier than 
males. The average difference in age at the time of arriving at maturity 
is probably two or three years. And in discussing age differences we 
concluded that, with the exception of the prepubertal period, relative 
variability, in height and weight at least, declines with age to maturity.
It follows that if we compare the heights and weights of boys and girls 
of the same age the girls are likely in the majority of comparisons to be 
relatively less variable simply because they are more mature. Their 
lower relative variability is a mark of greater maturity and not of a 
sex difference in relative variability. This possibility is recognized by 
Lincoln (50, p. 164) who says that variability may be a function of
maturity. Since most of our statistics have been of school children we 
seem thus to have succeeded in reversing the true picture and have 
erroneously concluded that males in general are relatively more vari- 
able than females. If it were not for sex differences in the ages at which 
prepubertal and pubertal accelerations in growth take place, it is prob- 
able that males would more consistently be relatively more variable 
than females during the growth period. 

On intelligence tests at the adult level we do not know accurately
how the sexes compare either as to the average level of ability or as 
to variability. Prevailing opinion is that the average level is about the 
same, but that there are average differences in ability to score on spe- 
cialized tests. 

That there is a sex difference in intelligence at ages 11, 12 and 13 is
shown by Table V. Scores on different tests have been made comparable
by this method: the test score for 11-year-old girls is changed to
11 and the test score for 13-year-old girls is changed to 13. The other 
scores are then converted to this scale by proportion. 
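
The conversion just described can be sketched as a linear rescaling. This is one reading of "by proportion" and is offered as an assumption; the function name and the raw scores below are invented and are not taken from any of the studies in Table V.

```python
def to_age_scale(raw, girls_raw_11, girls_raw_13):
    """Map a raw test mean onto the scale on which girls score 11 and 13."""
    years_per_point = (13.0 - 11.0) / (girls_raw_13 - girls_raw_11)
    return 11.0 + (raw - girls_raw_11) * years_per_point

# Hypothetical raw means on one test (for illustration only).
girls_11, girls_13 = 60.0, 76.0
boys_12 = 66.0

print(to_age_scale(girls_11, girls_11, girls_13))  # 11.0
print(to_age_scale(girls_13, girls_11, girls_13))  # 13.0
print(to_age_scale(boys_12, girls_11, girls_13))   # 11.75
```

Because the girls' means fix both ends of the scale, every other group mean is expressed in years of girls' mental growth on that test.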

TABLE V

Sex Differences In Intelligence At 11, 12 And 13 Years As Determined
By Nine Intelligence Tests

                                 Age 11        Age 12        Age 13
Test                      Sex     N    Mean     N    Mean     N    Mean

National (101)             B    613   10.08
                           G    643   11.00
Illinois (101)             B    155   10.51   171   11.27   166   12.13
                           G    175   11.00   184   12.19   157   13.00
Stanford-Binet,            B    45?   10.27   45?   11.71   45?   12.83
  1916 (84)                G    45?   11.00   45?   12.03   45?   13.00
National (53)              B     98   10.33    89   11.69    89   12.16
                           G     88   11.00    98   11.96    67   13.00
Pyle (71)                  B     73    9.71    80   10.42    82   11.44
                           G     73   11.00    89   12.07    73   13.00
McCall (9)                 B     94   10.80   102   11.18    97   12.10
                           G     98   11.00   101   11.82    84   13.00
Pressey (70)               B    179   10.41   182   11.26   174   12.48
                           G    167   11.00   180   11.76   174   13.00
Army Alpha (10)            B     22   10.44    34   11.55    35   11.53
                           G     33   11.00    31   12.12    35   13.00
VACO (25)                  B    150   10.83   163   12.01   149   12.83
                           G    176   11.00   168   12.03   144   13.00
Yerkes et al. (107)        B     33   10.91    34   12.11    28   12.75
                           G     29   11.00    32   12.68    26   13.00

Weighted Means             B   1453   10.38   900   11.42   865   12.29
                           G   1527   11.00   928   12.01   805   13.00

Adjusted                   B                 3218  11.367
  Weighted M's             G                 3260  12.000

Difference and standard error                .633 ± .0464

Terman (84, p. 66) does not report the numbers of cases for each
age and sex but he gives a total of 905 cases for ages 5 to 14 inclusive.
This is an average of 45.25 cases for each age and sex group. Hence I
have used 45 in the table. The numbers of cases given for Pyle's test
are averages based on his tables for seven different tests. The score
for boys on the National Test reported by Whipple is arrived at by 
counting 15 points as 1 year — this from the test norms. On the Yerkes 
Point Scale, all cases from 11 to 12 are grouped at 11, and so on. The 
final difference score, called the adjusted weighted mean difference, is
found by adding 1 year to the 11 year scores and by subtracting 1 year 
from the 13 year scores. This groups all cases at 12 years. Distribution 
S.D.’s are assumed to be 16 IQ units in standard error calculations. 
Hence for 12 years the S.D. used is 1.92. 

This method gives the boys a mean score of 11.367 and the girls a 
mean score of 12. The difference, .633, is equal to 7.6 months. Taking 
three standard errors below and above the difference gives us .494 and 
.772. These are equivalent roughly to six months and nine months.
Hence we conclude that for the above battery of tests there is a sex 
difference in mental maturity of from six to nine months at the age of 
twelve years. 
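
The adjustment and the interval can be retraced from the weighted means reported in Table V. The sketch below is a check on the arithmetic, not the original computation; the variable names are invented, and the standard error is taken as reported rather than recomputed.

```python
# Weighted means from Table V: (N, mean) for ages 11, 12 and 13.
boys = [(1453, 10.38), (900, 11.42), (865, 12.29)]
girls = [(1527, 11.00), (928, 12.01), (805, 13.00)]
shifts = [1.0, 0.0, -1.0]  # move age-11 and age-13 scores to age 12

def adjusted_weighted_mean(groups):
    """Pool the three age groups at 12 years, weighting by N."""
    total_n = sum(n for n, _ in groups)
    total = sum(n * (mean + s) for (n, mean), s in zip(groups, shifts))
    return total / total_n, total_n

boys_mean, boys_n = adjusted_weighted_mean(boys)
girls_mean, girls_n = adjusted_weighted_mean(girls)
print(boys_n, round(boys_mean, 3))   # 3218 11.367
print(girls_n, round(girls_mean, 3))

# Interval of three standard errors, using the reported figures.
diff, se = 0.633, 0.0464
low, high = diff - 3 * se, diff + 3 * se
print(round(low, 3), round(high, 3))  # 0.494 0.772
print(round(diff * 12, 1), round(low * 12, 1), round(high * 12, 1))  # months
```

The boys' figure reproduces the reported 11.367. The girls' figure computed from the rounded table entries comes out slightly above the reported 12.000, because the age-12 girls' mean is entered as 12.01.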

Something should be said about results not included in the table. 
Whipple reported National test results for four cities for children aged 
11 years. In two of these cities a special effort was made to test every
child of this age. This apparently was not done in the other two cities.
I have included the two cities where all cases were tested and have excluded
the others. Wechsler-Bellevue and Revised Stanford-Binet
results are not included because an effort was made to eliminate sex
differences when the tests were standardized. Dearborn Test results
are excluded because Lincoln (50, pp. 166 ff.) has shown that the test
content unduly favors boys. As a result, it, like the two tests just mentioned,
gives about equal scores for the sexes.

A battery consisting of these three tests would show the sexes about 
equal. So far as I know, no similar battery of general intelligence tests
now in use in this country will show boys mentally more mature than 
girls at these ages. At the high school level, boys are likely to test higher 
than girls because the boys are more drastically selected. 

If girls test higher than boys at age 12, and if the sexes arrive at the
same final average level, this shows that the girls are maturing earlier.
Since we found that relative variability decreases with age to maturity
it seems that we confront about the same situation with respect to
intelligence as that found for physical traits. When age comparisons
show girls with lower V’s, the correct interpretation seems to be that
girls are more mature, and, being more mature, they are consequently
relatively less variable. When the boys arrive at the same maturity
level, the V’s for them will have decreased. To ignore the maturity
variable and to draw conclusions about adult sex differences from age
comparisons of boys and girls leads apparently to error, and possibly
to the reverse of the truth.

TABLE VI

Relative Variabilities Of The Sexes On Tests In Which One Sex
Is Definitely Superior

Test                                N  Sex       M    S.D.    V

Artificial Language (19)         3236   M     22.8    11.0   48
                                 2632   F     29.5    12.9   44
Arithmetic (19)                  3236   M     29.6    12.5   42
                                 2632   F     21.9    11.2   51
English (19)                     3453   M     49.9    18.1   36
                                 2880   F     55.3    18.4   33
Mathematics (19)                 3453   M     34.7    14.9   43
                                 2880   F     25.2    12.3   49
Geometrical Constr. No. 8 (93)    540   M     10.7     3.4   32
                                  703   F      8.9     4.3   48
Tonal Intensity (53)              100   M     81.8     8.3   10
                                  100   F     77.3    11.3   15
Tonal Memory (53)                 100   M     75.5    11.8   16
                                  100   F     59.7    16.1   27
Beta 2, Cube Analysis (103)      1161   M    13.12    4.80   37
                                 1160   F    10.06    4.08   41
Beta 5, No. Checking (103)       1161   M    24.42    8.15   33
                                 1160   F    27.72    8.70   31
Conservatism-Radicalism (82)      181   M     23.8     9.0   38
                                  206   F     26.9     8.4   31
Unpleasantness (Sense of) (82)    187   M     31.6    16.7   53
                                  202   F     34.6    15.8   46
Throwing a Ball (15 yrs.) (23)    117   M    134.8   24.18   18
                                  138   F     73.6    18.5   25
Grip Strength (22.5 yrs.) (77)          M    82.16   11.86   14
  (20)                                  F    51.00   10.20   20
Mechanical Comprehension (2)       54   M    43.78    7.70   18
                                   53   F    27.57    8.83   32

We have found that relative variability usually decreases with de- 
velopment. On this basis, if females are better than males in Trait A 
and are poorer in Trait B, they should, other things being equal, be 
relatively less variable in A and more variable in B. The largest sex 
differences on the American Council Psychological Examination are 
found for artificial language and arithmetic. On the Iowa High School 
Content Test the differences are greatest for English and mathematics.
Table VI supplies the California Junior College norms for these tests
as reported by Eells (19). To these results I have added geometrical
construction (Touton, 93), tonal intensity and tonal memory (McNemar
and Terman, after Church, 53), Army Beta 2 and 5 (Winsor,
103), conservatism and sense of unpleasantness (Skaggs, 82), throwing
a ball (Espenschade, 23), strength of grip (Ruger and Stoessiger, 77;
Elderton and Moul, 20), and mechanical comprehension (Bennett and 
Cruikshank, 2). V’s have been calculated or checked by the writer 
and are given to the nearest whole number. 

The majority of these V’s are too high because of the lack of true
zeroes. V’s for throwing a ball are too high because, as Hill (40) points 
out, the distance travelled by a projectile varies as the square of the 
initial velocity. Hence theoretically we should take the square root of 
the original measurements before computing the value of V. However, 
these defects affect both sexes in the same way, so that our comparison 
of the sexes is valid even though the V’s are too high. 
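
The effect of the square-root correction can be seen on a small set of figures. The sketch below is an illustration only: the distances are invented, and V is taken as 100 times the population S.D. divided by the mean.

```python
import math
import statistics

def relative_variability(values):
    """V = 100 * S.D. / M, using the population S.D."""
    return 100.0 * statistics.pstdev(values) / statistics.mean(values)

# Hypothetical throwing distances in feet (invented for illustration).
distances = [100.0, 115.0, 130.0, 135.0, 150.0, 170.0]
# Square roots are proportional to initial velocity if distance varies as v squared.
velocities = [math.sqrt(d) for d in distances]

v_distance = relative_variability(distances)
v_velocity = relative_variability(velocities)
print(f"V of distances: {v_distance:.1f}")
print(f"V of square roots: {v_velocity:.1f}")
```

For moderate spreads the V of the square roots is close to half the V of the raw distances, which is the sense in which the uncorrected V’s are too high.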

An examination of the table shows that the hypothesis holds in all 
cases. However, no claim is made that this is always true. It should 
be said that this table is made up of characteristics which show the larg- 
est sex differences in averages. If a random selection of traits were 
studied, most of the differences would be small, and as a result of errors 
of sampling and of measurement a much less consistent result should 
be expected. In any case, the hypothesis seems to work often enough 
to justify the suggestion that adult women will more commonly be
relatively less variable in those things in which they are better; while
men will be relatively less variable in those things in which they are 
better. 

Absolute variabilities are less consistent. In six out of fourteen cases 
the higher mean goes with the lower S.D.; in the remaining cases the
higher mean goes with the higher S.D. 
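
Both tallies, for relative and for absolute variability, can be recounted from the means and S.D.’s of Table VI. The sketch below simply re-enters those figures; the tuple layout and variable names are choices made for the illustration.

```python
# (test, male mean, male S.D., female mean, female S.D.) from Table VI.
rows = [
    ("Artificial Language", 22.8, 11.0, 29.5, 12.9),
    ("Arithmetic", 29.6, 12.5, 21.9, 11.2),
    ("English", 49.9, 18.1, 55.3, 18.4),
    ("Mathematics", 34.7, 14.9, 25.2, 12.3),
    ("Geometrical Constr. No. 8", 10.7, 3.4, 8.9, 4.3),
    ("Tonal Intensity", 81.8, 8.3, 77.3, 11.3),
    ("Tonal Memory", 75.5, 11.8, 59.7, 16.1),
    ("Beta 2, Cube Analysis", 13.12, 4.80, 10.06, 4.08),
    ("Beta 5, No. Checking", 24.42, 8.15, 27.72, 8.70),
    ("Conservatism-Radicalism", 23.8, 9.0, 26.9, 8.4),
    ("Unpleasantness (Sense of)", 31.6, 16.7, 34.6, 15.8),
    ("Throwing a Ball", 134.8, 24.18, 73.6, 18.5),
    ("Grip Strength", 82.16, 11.86, 51.00, 10.20),
    ("Mechanical Comprehension", 43.78, 7.70, 27.57, 8.83),
]

lower_v = lower_sd = 0
for _, m_mean, m_sd, f_mean, f_sd in rows:
    # Order the pair so that the sex with the higher mean comes first.
    (hi_mean, hi_sd), (lo_mean, lo_sd) = sorted(
        [(m_mean, m_sd), (f_mean, f_sd)], reverse=True)
    lower_v += hi_sd / hi_mean < lo_sd / lo_mean
    lower_sd += hi_sd < lo_sd

print(lower_v, "of", len(rows))   # 14 of 14
print(lower_sd, "of", len(rows))  # 6 of 14
```

The sex with the higher mean has the lower V in every row, but the lower S.D. in only six rows.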

As related to the question of genius, absolute variabilities seem
more important than relative variabilities, because, as was stated
above, in eight cases out of fourteen, the higher mean
is associated with the higher S.D. However, Table VI is in most cases 
not based on the performances of mature adults, and measurements for 
making a fair comparison of adults are not available. 

Race Differences 

The V for opinions on this subject is very large, but the problem 
must remain unsettled until we have more measurements. 

Differences within the Individual 

Trait Variability. We have mentioned Hull’s estimate (42) that on
the average the different traits of a single individual vary 80 per cent 
as much as a single trait varies in a group of individuals. Instead of 
trying to answer the question in that form we can give a more certain 
answer if the question is put more specifically. 

The amount of trait variability in the individual depends on the 
intercorrelations between the traits in question. Ghiselli (27) has supplied
a formula for computing the extent of trait variability from the
average intercorrelation between traits. Most investigators have found
that the intercorrelations between simple motor functions are near
zero. This means that trait variability of motor functions is about 100
per cent of the variability of a single trait in a group. This is verified in
a study made by Owens (61). As the intercorrelations between the 
traits studied increase, trait variability within the individual decreases. 
Presumably the more nearly we are able to measure pure and inde- 
pendent traits the lower the intercorrelations will be and the greater 
the trait variability. 
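
Ghiselli's formula is not reproduced here. As an illustration of the relation described, the sketch below uses a standard identity that is assumed, not confirmed, to agree with it: for traits in standard-score form with average intercorrelation r, the expected variance of one person's scores about that person's own mean is (1 - r) times the variance of a single trait in the group. The function name is invented.

```python
import math

def trait_variability_ratio(avg_intercorrelation):
    """Within-person S.D. across traits, as a fraction of the group S.D.

    Root-mean-square value for standard scores: sqrt(1 - r).
    """
    return math.sqrt(1.0 - avg_intercorrelation)

for r in (0.0, 0.25, 0.5, 0.75):
    print(f"average r = {r:.2f}: ratio = {trait_variability_ratio(r):.2f}")
```

With an average intercorrelation of zero the ratio is 1, that is, 100 per cent, and it falls as the intercorrelations rise.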

Measurements of complex functions, achievement tests included, 
usually involve considerable overlapping with the result that individual 
variability is correspondingly reduced. Gray (31) reports that the 
average range of individual variation on six achievement tests is two 
S.D.’s. This means that individual variation on such a limited battery 
of tests is about 35 per cent of group variation. 

From the foregoing it appears that the amount of trait variability 
found in any particular study will depend largely on the battery of 
tests used. Any general average for all psychological traits would 
under present conditions be somewhat arbitrary and tentative. 

Owens (62) studied the effects of practice on a group of motor traits 
and found that they did not become more alike with practice. This 
seemed to indicate that motor trait differences were due to innate
causes.

The question has been raised as to whether there are differences in
trait variability at different levels of ability. Among the studies of 
this problem are those by De Voss (13), Hertzman (39), Down (5), and 
Gray (31, 32). Their results are conflicting and inconclusive. However, 
Garrett (26) has shown that the correlations between such functions as 
verbal, numerical and spatial abilities decrease with age to maturity. 
From this it would follow that the amount of trait variability would be 
positively correlated with the level of ability. The failure of other 
workers to find clear evidence of this trend may be attributed to either
or both of two factors: the abilities tested were too complex or the age 
range of subjects was too limited. 

Quotidian Variability. Woodrow (104) uses the term “quotidian
variability” to cover variations in level of performance from day to
day. Owens (61) uses the term “repetitive variations” and includes
under this the systematic changes due to learning. In a study of motor 
skills Owens finds repetitive variations to be about 13 per cent as great 
as individual differences, and he attributes 90 per cent of the repetitive 
variations to learning. This makes quotidian variability unimportant 
in his study. 

Elliott (21) finds that strong motivation decreases variability of 
performance on practice tests in rats. From this we can infer that a 
part at least of quotidian variability is due to differences in concentra- 
tion and effort. This, of course, is what we should expect. 

Variations in the results of achievement, intelligence, personality 
and other tests show that individuals do vary considerably from day 
to day. The unreliability of questionnaire tests is well known. So is the 
unreliability of teachers’ marks. All of these depend largely on quo- 
tidian variability. 

Paulsen (63) points out that coefficients of reliability obtained by 
the split-half technique are higher than those obtained by the test- 
retest method. This difference is due to fluctuations in the strength 
or efficiency of the traits themselves. He proposes to measure this by 
correcting the test-retest reliability for attenuation. This will indicate 
the highest possible test-retest reliability. The Spearman-Brown correction
formula does not apply to test-retest reliabilities, and in a later
study of steadiness (64), Paulsen finds that the highest possible test- 
retest reliability is about .80. This means that the factors contributing 
to quotidian variability in steadiness are much more important than 
Owens found in his study of motor abilities. 
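
Paulsen's proposal can be illustrated with a minimal sketch. It assumes the standard Spearman correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. Treating the split-half coefficient of each session as that session's reliability is an assumption made here for illustration and is not a formula quoted from Paulsen. The function name and the sample coefficients are hypothetical.

```python
import math


def max_test_retest_reliability(r_test_retest, r_split_half_first, r_split_half_second):
    """Correct an observed test-retest correlation for attenuation.

    Assumes each session's split-half coefficient estimates that session's
    reliability (an illustrative assumption, not Paulsen's stated procedure).
    """
    for r in (r_split_half_first, r_split_half_second):
        if not 0.0 < r <= 1.0:
            raise ValueError("split-half reliabilities must lie in (0, 1]")
    return r_test_retest / math.sqrt(r_split_half_first * r_split_half_second)


# Hypothetical coefficients, not data from any study cited in the text.
observed = 0.72
corrected = max_test_retest_reliability(observed, 0.90, 0.90)
print(f"observed {observed:.2f}, corrected for attenuation {corrected:.2f}")
```

On this reading, whatever gap remains between the corrected coefficient and 1.00 cannot be charged to errors of measurement within a session and is left to be explained by day-to-day fluctuation in the trait itself.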

A related study of trait consistency on the behavior side has been 
made by Trawick (94). He finds consistency in performance to be an
important indicator of the integration of personality. The more consistent
individuals are generally more self-confident, ascendant, have
higher self-esteem, are more goal-seeking, more sociable, more predict- 
able, objectively more modest, and have more social insight. From 
this point of view, the study of quotidian variability involves us in the 
problems of unstable personalities. 

This should be a profitable field for future study and particularly do 
we need to learn more about the limits of test-retest reliabilities. 

Discussion 

If we arbitrarily define a superior person as one who scores three 
sigmas above average, the superior person is about twice as good as the 
average in relatively simple capacities such as visual and olfactory 
acuity. As we descend the V scale the superior person’s ratio of supe- 
riority decreases. On tapping speed the superior person is about forty 
per cent better than average. On stature he is only about twelve per 
cent above average. On level of CAVD intelligence he is only about 
eight per cent above average. Also if total ability is considered to be 
the sum of different more specialized abilities, the superior individual 
will deviate less from the average in total ability than he does in some 
specialized abilities. On this basis probably we should not be surprised 
when we find that a successful politician shows more superiority in the 
particular characteristics required for winning votes than he shows later 
in the way of general administrative ability. 
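
The ratios quoted above follow from simple arithmetic if V is taken to be the usual coefficient of variation, V = 100 sigma / M, so that a person three sigmas above the mean stands at 1 + 3V/100 times the average. The Python sketch below inverts that relation to show the V that each quoted ratio would imply. The definition of V, the function names, and the resulting V values are assumptions made here for illustration; the V values are not taken from the text.

```python
def superiority_ratio(v_percent, sigmas=3.0):
    """Ratio of a score `sigmas` standard deviations above the mean to the mean,
    given V = 100 * sigma / mean (assumed definition of V)."""
    return 1.0 + sigmas * v_percent / 100.0


def implied_v(ratio, sigmas=3.0):
    """Invert superiority_ratio: the V that would produce the given ratio."""
    return 100.0 * (ratio - 1.0) / sigmas


# Ratios as stated in the discussion; the implied V values are derived here.
quoted_ratios = {
    "visual and olfactory acuity": 2.00,
    "tapping speed": 1.40,
    "stature": 1.12,
    "level of CAVD intelligence": 1.08,
}

for trait, ratio in quoted_ratios.items():
    print(f"{trait}: ratio {ratio:.2f} implies V of about {implied_v(ratio):.1f}")
```

Read this way, the fall in the superior person's advantage from the simple capacities to CAVD intelligence is a restatement of the fall in V.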

General Conclusions 

Much of the confusion found in conflicting statements about 
relative variability is due to the use of different methods, to failure to 
discriminate between absolute and relative measures of variability 
and to the use of the V formula with scores that are far from having 
true zeroes. 

Too many of the “laws” and broad generalizations about relative
variability proposed by different writers will not stand critical examina- 
tion. More complex, higher and more recently developed functions 
tend to be relatively less rather than relatively more variable. As 
judged by relative variabilities, complex mental functions seem to be 
the sums rather than the products of simpler functions. 
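
The reasoning behind the last sentence can be made concrete with a small calculation. The sketch below compares the relative variability of a sum and of a product of simpler functions, under assumptions that are introduced here and not stated in the text: the components are statistically independent, have equal means, and share the same coefficient of variation (expressed as a fraction, not a percentage). Real abilities are correlated, so the figures are illustrative only.

```python
import math


def cv_of_sum(cv, n):
    """Coefficient of variation of the sum of n independent components
    with equal means and a common coefficient of variation `cv`."""
    return cv / math.sqrt(n)


def cv_of_product(cv, n):
    """Coefficient of variation of the product of n independent components,
    each with coefficient of variation `cv`."""
    return math.sqrt((1.0 + cv ** 2) ** n - 1.0)


component_cv = 0.10  # hypothetical value for each simple function
for n in (1, 2, 4, 8):
    print(
        f"{n} components: sum {cv_of_sum(component_cv, n):.3f}, "
        f"product {cv_of_product(component_cv, n):.3f}"
    )
```

Under these assumptions a sum is relatively less variable than its components and a product relatively more variable, so the observation that complex functions are relatively less variable is the pattern expected of sums.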

Practice usually reduces relative variability. Fatigue and forgetting
usually increase relative variability. Changes in relative variability
with age differ greatly according to the trait under consideration but
with a definite tendency for relative variability to decrease with age
until maturity and to increase thereafter. Sex differences in relative
variability are small, but greater relative variability is more often 
found in boys. This probably means that girls of a given age are more 
mature than boys of the same age and not that females are generally
relatively less variable than males. Adult females are relatively more 
variable physically than males. In general, low relative variabilities 
indicate superiority. Each sex tends to be relatively less variable in 
those traits in which it is superior. 

Relative variabilities are determined largely by rates of growth and 
decline and by differences in the timing of changes in rate. 

The per cent which intra-individual trait variability is of individual 
differences ranges from 100 downwards, depending on the traits tested. 
The study of quotidian variability is largely in the exploratory stage.

A DISCUSSION OF SOME CAUSES OF OPERATIONAL
FATIGUE IN THE ARMY AIR FORCES* 

LESSING A. KAHN 
University of Pennsylvania 

Definitions 

According to the Flight Surgeon's Handbook, operational fatigue is 
defined as a predominately emotional condition found in air crew 
personnel as a result of abnormal strains being placed on normal in- 
dividuals. 

. . . flying stress or operational fatigue is used to describe a condition that
may be observed as an abnormal flying strain being placed on a normal
individual (10).

Its prevalence was marked in members of crews engaged in combat 
flying. The dangerous and extremely nerve-racking character of combat
flying acted as an overwhelming stress upon many so engaged.

The supposition that “flying fatigue” is in any strict sense to be
construed as being equivalent to the more general and more subtle 
syndrome of operational fatigue is, in my opinion, without basis; 
notwithstanding that such fatigue may enter as a factor contributing 
to the generation of operational fatigue. Spiegel notes (12) that the 
term “flying fatigue” as used to describe a clinical state may be mani- 
fested by any one of the following symptoms: anxiety, depression, 
anorexia, dreams, agitation, tremors, loss of confidence, regressiveness, 
and many others that are psychological in origin; fatigue of this sort is 
not a reaction to flying per se, rather to the conditions under which 
flying takes place. Hastings (5) makes a most accurate and clear 
differentiation between these two types of fatigue: 

Flying fatigue . . . means ordinary fatigue and the physical and mental
symptoms of it and does not imply that the individual is emotionally sick.
Flying fatigue is the same as the fatigue any individual would suffer if he had
insufficient sleep, rest, and relaxation, and had been exposed to the nervous
strain of flying.

Operational fatigue . . . is used to describe a typical syndrome of breakdown
occurring in essentially stable individuals, who by continued stress, harrowing
experiences, and physical fatigue develop an illness which is roughly half fatigue
and half emotional illness.

* The author wishes to express his appreciation to Prof. Samuel W. Fernberger for
reading this paper and offering many helpful suggestions.

(Editor's Note: Mr. Kahn served in the Army Air Force as a combat navigator with
the rank of 1st Lieutenant. He was stationed in Italy with the 450th Heavy Bombardment
Group, 15th Army Air Force, from July until September 1944, when he was shot
down in combat over Vienna on September 12th. He was then interned as a prisoner of
war in Germany where he had the opportunity to collect case history materials
concerning the acute emotional conditions considered in his review.)

Symptoms 

The most common symptoms of operational fatigue lie within the 
broader context of anxiety states (4), in which the individual is still
attempting to deal with a situation that has long since been eliminated
from the present sphere of environment; in short, the individual is
attempting to deal with reactions to combat. Moreover, according to
Hastings, as regards the direction in which aggression can spread:

. . . schematically one can hit back at that which threatens one, one can hit 
at other people, at inanimate things or at oneself . . . [projection, dissociation, 
introjection, etc.]

In a study made of combat fatigue, etc., 29% of successful combat men state 
with feeling that they have a personal hatred against the enemy, that they want 
to kill the men in the enemy fighters — at least when they are in the air. More 
than 70% state that they develop irritability and quick flaming anger at their 
crew mates, in a way entirely foreign to their usual feelings and actions, as their 
operational tour progresses. Such feelings of animosity develop most charac- 
teristically during periods of inactivity, when there is little active outlet for 
their feelings. The monthly dances held at group bases have been marked by
a number of violent fights. For example, on one occasion two squadron commanders,
quite close friends, riding back to the post after an evening of moderate
drinking, amicably and deliberately decided they “needed a fight,” and without
any quarrel of any sort, got out of the car and fought violently until one broke
a metacarpal bone, after which they amicably climbed back into the car and
drove home. 

In order to relieve tension, it was not uncommon for men to shoot out the 
lights with a tommy gun or to shoot one’s initials into a wall with a pistol. The 
feeling was that of just blowing off or smashing something. 

Several men have reported, upon close examination, that they have seduced 
women, not for any sexual gratification, but for the sake of subduing and conquering
their defenses.

Enlisted men’s barracks are the scenes of consistent and brutal attacks upon 
the equanimity of new crews coming into them. The old combat man describes 
to the new men the appearance of a man’s brains scattered about a plane by the 
action of a 20 mm shell, and the like (5, pp. 137-139).

The persistence of anxiety, brought about by flying incidents in 
which personal security and safety have been menaced, is characterized
by a breakdown in adaptive mechanisms. This is manifested by the
individual’s failure to relax and be free of tension when he is no longer
confronted with the combat situation. Grinker and Spiegel (4) give
us a clear picture of the clinical syndrome of anxiety:

Severe anxiety states result in an intensely striking, unforgettable picture. 
Terror-stricken, mute, and tremulous, the patient closely resembles those suffering
from an acute psychosis. The facial expression may be vacuous or fearful
and apprehensive. There are coarse tremors of the extremities. Speech is
usually impossible except for a few stuttering attempts to frame an occasional
word. Sudden fits of crying or laughing may occur without reason. Behavior
is extremely bizarre, and attitudinizing with apparently senseless gestures,
alternates with periods of excessive activity, characterized by running about the
ward and leaping over the beds. Terror is one of the principal themes of the
patient’s behavior. He resembles a frightened inarticulate child, with only a 
few persistent “islands” of his past well-organized behavior. 

Mild anxiety states in contrast to the severe anxiety states present uniform 
and sometimes monotonously similar clinical pictures. Upon going into battle, 
anxiety appears in gradually increasing amounts. At first it is kept under con- 
trol through an effort of will. Under the continued stress of battle, of near 
escapes from death, of constant exposure to anti-aircraft fire, the anxiety over- 
whelms him. He develops gross persistent tremor and feels weak, as if his legs 
would carry him no further. He becomes dizzy so that intelligent thought is 
impossible. 

The picture of the mental state of these men is typical and strikingly 
uniform as to symptomatology (3, 4, 5, 6, 9, 12). On missions the 
individual is usually tense, jittery, restless, and inefficient. In the
presence of flak or enemy fighter opposition, he becomes overly appre- 
hensive and fearful of the entire combat presentation. He complains 
that the airplane will be shot down, that he will have to bail out, that 
the plane will be shot up to such an extent that escape for all will be 
impossible, or that he will be hit by flak or other missiles. In many 
cases, free-floating anxiety is experienced along with the usual physio- 
logical concomitants of anxiety, such as, dizziness, nausea, feelings of 
weakness, high blood pressure, rapid pulse, extreme perspiration, and 
the like (4). More severe symptoms may become evident such as head- 
aches, vomiting, fainting spells, etc. Needless to say, the appearance 
of these symptoms of fatigue during the course of a mission not only 
endangers the entire organization, but summarily affects the over-all 
efficiency of the individual concerned. In the more severe cases, a state 
of fear and terror develops to the point where bailing out becomes the
only possible solution. Levy (7) cites several interesting examples of
patients who lay upon the floor of the plane to prevent themselves from
giving in to the impulse of bailing out. In between missions, sleep is
disturbed; dreams and nightmares centered about combat experiences,
increased tension and irritability, etc., are the rule (4, 5).

The appetite is impaired and weight is usually lost (3). Excessive 
drinking and smoking as an attempt to relieve anxiety is common; and
the need for finding sexual outlets is intensified by the factors of fatigue
and anxiety of these individuals (4, 5). A frequent complaint (12) and
at the same time a symptom is the loss of confidence in the ability to 
perform the job with responsibility, and a conflict occurs with regard 
to ultimate consequences to the remaining crew members. This brood- 
ing results in further conflict and anxiety, which has its most telling 
effect on an already weakened ego with ever increasing feelings of 
insecurity and loss of balance. 

The following case cited by Grinker and Spiegel illustrates for us the 
developmental aspects of the fatigue syndrome: 

A twenty-four year old airman had demonstrated fine skill as a pilot and was 
much respected for his aptitude and judgment. During a night mission, the 
bomber plane in which he was flying co-pilot became lost. The pilot cruised for 
hours looking for home base, until finally the gas ran out and he instructed the 
crew to parachute to safety. However, he had misjudged the altitude. Four 
other members of the crew got out safely, but the patient had considerable
difficulty getting out of the pilot’s compartment to the escape door. There was 
much delay and loss of precious time while he struggled to get through; finally 
by a super-human effort he made it and jumped, leaving the pilot still in the 
plane. Because of the low altitude, his parachute had just opened when he 
landed on rough ground, injuring his back. A few minutes later he heard an ex- 
plosion accompanied by a tremendous flash of flame; obviously the pilot had 
crashed some distance away. Although he was concerned for the safety of the 
pilot, nothing could be done that night because of the complete blackness. 

The following morning he rounded up the other members of the crew who 
were safe in the area, and set out to search for the plane and pilot. They 
found the pilot’s body, a charred and smashed mass, among the twisted 
wreckage of the plane. Clearly, the pilot had not had time to escape before the
plane crashed. Eventually the men were picked up by friendly natives and
made their way back to base where the patient was hospitalized because of the 
back injury, which consisted of a fractured transverse vertebral process. In the 
hospital he at first exhibited marked continuous anxiety, associated with tremor 
and terrifying dreams, in which he saw himself falling from a plane and crashing 
on the hard ground. Considerable depression and grief for the dead pilot, who
had been a good friend of his, was manifested. The grief was accompanied by
some conscious guilt because of the patient’s delay in escaping from the plane.
This had not actually been his fault; but had it not occurred, the pilot could 
have jumped in time to save his life. In time the anxiety and depression dis- 
appeared to be replaced by a pronounced conversion symptom related to the 
back injury. The lesion had healed; but the patient continued to complain of
the severe pain. At this point hospitalization was discontinued and the patient
returned to his unit in the hope that he would give up the conversion symptom, 
when back in the environment of his friends and active operational duties. He 
was given ground duties as an operations officer, and after a short time the 
conversion symptom disappeared, to be again replaced by a slight anxiety and 
depression. The patient felt very badly because his friends and class-mates at 
flying school were participating in missions while he remained on the ground. 
Accordingly, in spite of the mild anxiety, he asked to be restored to flying 
status. 

After a short period of training, during which he did very well, he was as- 
signed a crew and thereafter participated in seven bombing missions over Tunis 
and Bizerte, each time going through heavy flak, which severely damaged his
plane. He had no increase of anxiety in relation to these missions, and felt con- 
fidence in himself. On the eighth mission, however, his tent mate and best friend 
was shot down in flames over the target. The patient brought his own plane 
back uneventfully, but after his return he had an intense recurrence of anxiety 
with tremor. That night he had the first recurrence of the anxiety dream of 
falling. The next day he continued to have anxiety, but determined to fight it; 
he said nothing and went on a mission. However, he had so much anxiety that 
the experience was like a nightmare and he could scarcely keep his mind on the 
job. In addition, he was harassed by a specific phobic apprehension that the 
plane was falling off to the right — the direction in which he had seen his friend’s 
plane fall. This phobia persisted, in spite of the evidence of his senses and the 
instruments that the plane was in level flight. So much did this obsess the 
patient that he almost crashed the plane on landing, because of miscalculation, 
greatly upsetting the crew, who reported the situation to the Flight Surgeon.
The patient confessed the recurrence of anxiety to the Flight Surgeon, who 
referred him for psychiatric evaluation, and it was at this point that we first 
saw the patient. 

In the hospital he exhibited moderate anxiety, tremor, and dreams of falling, 
with a marked phobic response to every aspect of flying. He could not think 
of planes without strong anxiety. Because he had enjoyed flying, this reaction 
gave him considerable depression . . . (4, pp. 122-126).

In sum, the crux of the problem of adjustment to the combat situation
is the degree of success which the individual is able to maintain in
order to cope with the manifold factors effective in the combat situation, 
namely, threats to the individual’s personal safety and security. The 
initial phases of the fatigue syndrome are constituted of mild anxiety 
states, in which the individual is attempting to reconcile his continuation
in combat and maintaining his own personality intact. When such
attempts are uniformly unsuccessful and prolonged, the resulting anxi- 
ety states become more intricate, and implications therefrom become 
more inclusive symptomatically speaking. On the other hand, severe 
apprehension of the combat situation is the most likely extreme to be 
manifested by those individuals relatively free from anxiety and other 
fatigue symptoms. 





Causes 


Armstrong (1) lists the occupational causes of fatigue as follows: 


I. Physical Agents
    a. Heat
    b. Cold
    c. Improper clothing
    d. Vibration
    e. Glare
    f. Noise
    g. Wind
    h. Acceleration
    i. Barometric changes

II. Emotional Stresses
    a. Physical discomfort
    b. Responsibility
    c. Attention
    d. Concentration
    e. Alertness
    f. Apprehension
    g. Anxiety
    h. Fear

III. Deficiencies
    a. Oxygen (anoxemia)

IV. Toxic Agents
    a. Carbon Monoxide poison

Not only does combat flying involve the many factors given above,
but more comprehensive ones such as geographic and climatic conditions,
enemy installations, and opposition in active combat. Among
all of the factors contributive to operational fatigue, my experience
has shown the emotional stresses to be most potent. Such other
causative factors as deficiencies, toxic agents, geographic, and climatic 
conditions, etc., I have found to be only incidental to the really dispos- 
ing factor of emotional stress. 


Physical Agents 

In a consideration of combat stresses which dispose crew members to 
the sundry forms of operational fatigue, physical factors which operate 
during a flight play a decisive role. Missions are flown at high altitudes
and in close formation to afford a maximum amount of protection from
enemy opposition. Oxygen masks, electrically heated suits, steel
helmets, and specially built armor suits (flak) are worn on all missions.
Pilot and crew members must suffer these physical discomforts for long 
periods of time and the effects of these precautionary devices are telling. 
Constant opposition from enemy fighter aircraft, the noises of bursting 
machine guns, cannon, shell hits, flak bursts and hits, the humdrum of
engines being throttled back and forth, the rush of the wind, the noise 
of radio and interphone equipment, and a host of other physical events 
all contribute to the general emotional and physical tenseness of the 
situation on hand. The following report by Hastings (5) vividly por- 
trays some of these conditions present on a combat mission: 

The airplane, a B-17, on a mission over a distant target in enemy-occupied
Europe, had most of its controls shot out by attacking planes before reaching
the target, so that the ship was knocked out of formation. The pilot, however,
with the exercise of great skill and strength, persisted in making an effective
bomb run. Following this, the lone airplane was attacked by about 100 FW-190 
fighters over a period of perhaps three-quarters of an hour, during which time 
extraordinary damage was done to the plane and crew. Virtually all of the 
crew were wounded, three severely, and one became anoxic as a result of the 
simultaneous explosion of a 20 mm. cannon shell next to him, and the severance 
of his oxygen system. Almost all of the control cables were cut in various 
places, the oxygen, hydraulic, and electrical systems were knocked out, the inter- 
phone and radio systems destroyed, a small fire started in the bomb-bay, large 
holes were put through both wings, holes were in both propellers, bomb-bay,
fuselage and nose; the tail assembly received so many direct cannon hits that 
it vibrated violently, and, after inspection by the flight engineer, was expected 
to tear off entirely at any moment. 

We must be cautious, however, not to localize any of these stresses 
and claim them to be the only causative factors operating to produce 
fatigue. Thus, in discussing physical agents as participating in the 
production of operational fatigue, we shall also note that each and every 
one listed is highly charged with emotionally disturbing aspects. 

During the first five missions most crews would have encountered the har- 
rowing experiences which were the normal events for heavy bombers operating 
from this theatre [Eighth Air Force stationed in England]. Watching close in 
and constant enemy fighter attacks, flying through seemingly impenetrable
walls of flak, seeing neighboring planes go down out of control, and at times 
explode in mid-air, returning with dead . . . (5). 

High altitude flying in itself constituted one of the greatest hazards 
of combat flying, the most important aspect of which was the constant
need for oxygen in proper amounts in order to function at maximum 
efficiency. The intense cold and the danger of frost bite were constant 
reminders to all crew members of the latent dangers at hand. 

My first two or three missions bring to mind the extreme discomfort of high
altitude flying. Equipped with inadequate heated suits, I would lie huddled in
the forward end of the nose of the ship, where I could at least observe the
instrument panel and record with frozen fingers the progress of the plane. I
frankly confess that I was completely oblivious of all that was going on around
me, that is, fighter and flak opposition. Under these conditions, I managed to
perform my duties, although at a minimum of proficiency. 

Glare played an important part in troubling crew members, especially
pilots and gunners, who had to be at top operational efficiency at all
times regardless of such disturbing factors. The great feeling of mutual
dependence which crew members held for one another often
caused them to endure the most trying of hardships, such as flying into
the sun and holding such a formation position until the tactical situation
changed. Gunners constantly had to scan the skies and be on the
alert for oncoming fighters, enemy fighters that purposely held the
sun behind them in attack so as to hamper our own gunners.

Whatever the rationale operative in these physical conditioning 
factors, one point is outstanding, namely, while they may not in them- 
selves be sufficient to bring about a partial or complete personality 
breakdown, yet their effect was quite apparent. 

Geographic and Climatic Conditions 

Morale and general fitness of crew personnel, preparatory and sub- 
sequent to combat duty, played a vital role in maintaining top ef- 
ficiency. The need for stimulating and diverting environmental changes 
was recognized early in the war, mainly due to the many combat 
reports which stressed such factors as positive measures in the pre- 
vention of fatigue and other mental disorders. 

Tropical or sub-tropical climates with their heat, humidity, and 
frequent rainfall were novel to fliers, and had definite effects upon them,
producing run-down physical conditions, loggy feelings, and lazy
attitudes (3). Semi-civilized countries, such as was the case in the 
South Pacific area, that lacked even a semblance of modern sanitary 
facilities, constituted a constant menace to the health of the men.
Because of geographical location and difficult supply lines, food and 
nutrition in general became another problem in keeping the morale and 
general fitness of the men at a high level. Repetitious diets with little
or no variation acted to make men disgruntled and malcontent, and 
made for the development of unwholesome attitudes with respect to 
their duties. Dougherty reports: 

During the month of November 1942 physical fitness in the pilots under- 
went a rapid decline. There was an acute shortage of fresh fruits, meats, and 
vegetables, and the foodstuffs that were available contained many gas forming
foods. Combat missions had to be flown in the morning and in the afternoon, 
and for a period of three weeks the chief food for the noon meal was a stew 
which was highly seasoned and which contained considerable gas forming 
elements. As a result there was a great increase in minor gastro-intestinal com- 
plaints such as heart-burn, gastric-distress, and at times nausea. The constant 
repetition of certain foods caused a distinct distaste for them and a very definite 
loss of appetite, and the food was not eaten. Consequently a diet which was 
satisfactorily balanced became one which did not meet the nutritional demands, 
and malnutrition characterized by weight loss developed. Vitamins were avail- 
able in the foods and in the supplemental multi-vitamin capsules; however, 
vitamin C was inadequate and remained so throughout the period. Fortunately, 
some fresh food supplies arrived in the early part of December, and the physical 
condition of the men improved (3, pp. 36-37). 

In the European theatre of operations, the over-all situation was not
radically different from that in the South Pacific. For example, I found
sanitation facilities in South Italy (from which a greater proportion of 
combat missions were flown) were none too good. Venereal disease rates
were high because of the highly infected population living close by to
operational fields, and recreational facilities in near-by towns were
in many instances limited due to the war-torn state of affairs. Toward
the end of the war, the pace of battle progressed so rapidly that men 
were not certain about where they were going to sleep from night to 
night. The constant demands of battle tactics kept most flight per- 
sonnel constantly on the move without much opportunity for recreation 
of a too varied sort. 

In short, factors which were in many instances completely out of
the control of army commands often contributed towards making the 
lives of combat men monotonous and void of any diversion from the 
nerve-wracking business of combat. This lack often, while not acting 
as the precipitating cause of operational fatigue, certainly contributed 
important aspects to its development and fruition. 

Emotional Stresses 

Probably the most important aspects of the personality structure 
of combat personnel were those to which we commonly refer as emotional.
In any given combat situation, physical and environmental
stress void of any emotional contributing elements is without meaning. 
The effect of enemy gun fire, for example, produced considerable ten- 
sion and is in one sense to be looked upon as a physical stress; yet in a 
stricter sense the emotional “charge” which such a condition produced
was more manifest and effective. Among these conditions the following 
are important: 

1. Enemy aircraft may be encountered and may result in seriously wounded 
or killed crew members; or “bailing out” or crash landings in enemy territory;
flying over mountains, over jungles, and over water. 

2. Geographic and climatic conditions, such as missions through storms, 
overcasts, and all sorts of altitude conditions that conjure up all sorts of possi- 
bilities as to the final outcome of the mission (for example, getting lost, crash 
landing, running out of fuel, parachuting, and the like). 

3. Base conditions, that is, location of bases with reference to enemy opera- 
tions; how frequently they are subject to attack by enemy bombers, strafing 
raids, and the like. 

4. And finally, each and every member of a crew feels responsibility toward 
his country and the principles for which he is fighting; that is, the men feel 
the necessity of accomplishing a mission not because they are given credit on 
the “board” for it but because they see it as a stab at the enemy. Their participation
means that many more bombs. Members of a crew hold themselves
responsible for the fear they manifest, not so much of the enemy fighters or flak
that they undoubtedly encounter as of what their buddies and squadron
organization will think of them. 

The requisites of a battle situation suggesting any of these afore- 
mentioned possibilities produce extreme emotional stress in all con- 
cerned. These are the problems which properly set the stage for emo- 
tional conflicts — conditions and situations in which the factors of en- 
vironment, physiological and emotional stresses, and so on are mutually 
reinforcing. 

Responsibility. The responsibility which command members of a 
flight crew feel for subordinates is probably one of the most potent 
motivating factors operative in personal relations. Does a pilot, for
example, consider himself adequately responsible for the safety and
well being of his crew? On non-operational flight duty, there is little
emotional stress on the part of a pilot in this one respect. There is
actually little need for it, since such missions do not call upon the pilot
to exercise anything above ordinary operational control over the
members of his crew. However, in battle, the situation is quite different.
In addition to routine flight command responsibilities, he must also
display judgment about the battle situation, he must effect decisions 
of paramount importance, bearing in mind at all times his responsibility 
to the command and its mission (of which he is a vital part) and to his 
crew members. At times, the emotional burden which this responsibility 
imposes is exceedingly heavy, and there are many instances of pilots 
(as well as other members of a crew) breaking under these demands. 
Wrong judgments are feared most of all, for the decision formulated in 
battle must be quick and accurate if the crew is to survive. Supposedly 
the selection of pilots for combat has been determined by a man's 
capability of exercising such responsibility without serious psychiatric 
consequences. However, this selection is not always effective, for as 
we have mentioned earlier, it often happens that apparently stable indi- 
viduals break unexpectedly with disastrous consequences for all con- 
cerned. 

I recall from my own experience several first pilots who resigned their posi- 
tion to accept a co-pilot's position because, as they claimed, they did not wish 
to assume the responsibility of nine crew members. They felt that they did not 
possess the necessary requisites for making quick decisions, for evaluating 
combat situations, and the like, contending that a wrong move on their part 
would bring serious consequences for which they did not want to feel everlast- 
ingly responsible. 

I also recall in connection with this discussion the incident involving a 
B-24 pilot who had lost a good portion of his crew by the exercise of a wrong 
judgment, namely, he pulled his ship out of formation with 3 engines in good 
condition, and thereby became an easy target for enemy fighters. 

The question of responsibility does not apply to the pilot alone, but 
extends to all members of a flight crew. The navigator is responsible to 
his crew to see that they reach their target and that they have a safe 
return route. He must be cautious with regard to his computations, 
the accuracy of the ship’s course, and many other details for which he is 
solely accountable. Mistakes on his part may not only affect the men 
in his own ship, but in many cases (e.g., when he is in the lead ship of a 
bomber formation) he is performing for a great number of other ships. 
One can readily realize the tremendous emotional stress which a navi- 
gator must undergo who is entrusted with these responsibilities. 

A navigator, during the heat of an intense fighter attack, abandoned his 
position and in great fear and anxiety fell to the floor of the plane for protection. 
The plane was crippled as a result of the attack and it became separated from 
the rest of the bomber formation. The navigator, by his action, had failed to 
keep track of the bomber's course and it turned out subsequently that the pilot 
brought the plane over a flak area and was shot down. The navigator’s neglect 
to perform his duty can be attributed not only to his fear and anxiety, the 
resultants of the battle situation, but to a definite lack of responsibility toward 
his crew mates. The literature is replete with cases in which men have been 
overcome with fear and anxiety, but who have refused to give up their position 
because of the responsibility that they had toward their buddies. 

The bombardier is held accountable for dropping his bombs on the 
target, not only that the mission might be considered successful, but 
that repeat missions will not become necessary and thereby further 
endanger a large force of men and planes. Bombardiers, because of the 
stresses of combat conditions, become emotionally upset, so much so, 
that repeat runs on the target become necessary; this entails one or 
more additional runs on the target that may very well be heavily de- 
fended by flak. Anxiety and fear, under these circumstances, are magni- 
fied considerably in all personnel concerned. Bombardiers who dis- 
played those characteristics soon became known and stood out in the 
squadron; their unfavorable reputation often caused them to become 
shy and seclusive; to complete the picture, the typical symptoms of a 
war neurosis became evident. 

On a bombing mission directed at a rail-bridgehead located in northern 
Italy, I recall the following details pertinent to our discussion. This was a 
very heavily defended target, since it was a main line of supply for the German 
forces in Italy. The task was to bomb the bridge and thereby hamper the 
supply flow. The group bombardier had had considerable experience in combat. 
It turned out that three bomb runs were made on this target; losses were much 
higher than necessary. The feeling of tension release which one experiences 
after making a bomb run (in getting out of a flak area) cannot be minimized. 
In repeating the bomb run twice, with the enemy’s flak becoming more and more 
accurate on each succeeding trial, I venture to say that not a small number of 
men would have gladly shot the lead bombardier. 

Gunners in all positions were held responsible for defending the ship. 
This task was very important for interference from enemy aircraft 
could and often did cause a mission to fail. Gunners, therefore, held 
a unique position in preventing harm from coming to their buddies and 
were indirectly responsible for the success or failure of the mission. 

The following details of a mission were reported to me. Returning from a 
mission, it was necessary for the group to cross the Adriatic Sea in order to 
reach their base of operations in Italy. It was standing operating procedure 
that upon reaching the sea, all gunners would be relieved, since there was then 
little danger of attack by the enemy. It was therefore customary to raise the 
ball gunner. However, on this particular occasion, for some unknown reason, 
the other gunners whose responsibility it had been to raise and release the 
ball gunner failed to do so. The ship ran out of fuel and it had to be ditched. 
It was too late to do anything about the trapped gunner. 

A second case concerns a nose gunner in a B-24. It was customary upon 
approach to the bomb run area for each crew member to don his flak suit. In 
the case of pilots, nose, and ball gunners, they are helped into their suits by 
nearby crew members. In the case of the nose gunner, it is the responsibility of 
the navigator to do so. Upon reaching the target area, the gunner began 
screaming over the interphone for his flak suit. The navigator was busy chart- 
ing the bomber’s course. It was the navigator’s feeling that at the time he was 
in no position to give attention to anything other than his job. An 88 mm. shell 
burst just ahead of the ship and a piece of flak killed the gunner. 

In the case of the fighter pilot, one would be inclined offhand to 
think that they had little responsibility to exercise — perhaps, only in 
the sense of completing a strafing mission, or a dive bombing mission, 
or just escort for a group of heavy bombers. Upon closer examination, 
however, it will be found that their responsibilities required as much of 
them as was the case with bombardment personnel. 

Tactical expediency in the case of fighter formations necessitated 
the use of two-plane elements. There was a lead plane and a wing ship. 
In this manner, it was difficult for an enemy plane to sneak up behind 
one without the other being aware of it. In combat, the lead plane did 
most of the aggressive fighting and the wing plane afforded protection. 
Credit for destroying enemy aircraft went to the lead ship by virtue 
of its function. Honors in battle were strongly vied for and intense 
jealousies were evident between lead pilots and wing pilots, since it was 
held that undue credit was being given to the lead pilot. The need, how- 
ever, for this arrangement of planes in combat was manifestly neces- 
sary for the safety of all concerned and the necessity for having a 
responsible wing pilot was very apparent. Unfortunately for both pilots, 
the wing pilot often forgot this responsibility and sought after a little 
glory of his own, with the result that both planes would suffer the almost 
inevitable consequences. 

A second responsibility of fighter pilots was manifest on cover or 
protective missions, that is, in escorting heavy bombers and protecting 
them from enemy fighter opposition. The need for this cover was es- 
sential and bomber personnel were always very grateful because of it. 
Failure on the part of fighters in meeting the bombers at the proper 
time led to serious consequences. A bomber could not weave about the 
sky in wait for its cover, since fuel consumption was a vital problem in 
every attack and target time was strategically set. The result of any 
failure of fighters and bombers to coordinate was evidenced by the 
high bomber losses incurred on the raid. 

Fighter pilots were always attracted to grounded enemy planes and other 
land targets. Their failure to meet pre-arranged schedules with bombers was 
often due to this fact; taking time to strafe grounded craft, supply trains, and 
other targets delayed their time to such an extent that for all practical purposes 
they were useless. 

In sum, responsibility as an emotional stress entails a number of 
factors. It consists primarily of an alertness, apprehension, and atten- 
tion to the many details that go to make up the job that has to be 
performed. It is not just a realization that so much has to be accom- 
plished; rather it is a knowledge of the necessity of performing that job. 
It consists of a concentration of energies and attention upon the rami- 
fying points entering into performance of duty as a whole. Failure to 
achieve any of these components ultimately leads to a conscious disa- 
vowal of responsibility or unconsciously to gross mistakes that in the 
final analysis must be interpreted as failure to act responsibly. The 
end product of all of this is, of course, some sort of mental conflict. 

The concept of responsibility may be looked upon from another 
point of view, namely, lack of responsibility as a function of or symptom 
of an already distorted personality. Thus, as was the case many times, 
crew members who had already acquired conflicts of one sort or another, 
such as, anxiety states, fears, and the like were by reason of their con- 
dition unable to act in a responsible manner. They were not only a 
danger and detriment to themselves and their crew mates, but in a 
larger sense to the entire formation of which they were a small part. 

A pilot failed to pull out of formation after being hit by a flak burst. It 
was standing operating procedure that when a ship was seriously hit and there 
was some danger of explosion, that the pilot was obliged to get the ship out of 
the formation in which it was flying. This pilot neglected to do so; the result 
was that he blew up and caused a number of other ships to go down with him. 
Upon subsequent investigation, the following facts concerning the pilot were 
brought to light: he had somewhat over 35 missions; he had been of late speak- 
ing of his extreme fear and anxiety of being hit by flak and catching on fire; he 
had exaggerated anxiety symptoms before being briefed, and had expressed 
the hope before briefing time that the mission for the day would be an easy 
one; and the like. There is little doubt that this pilot was mentally incompetent 
at the time he was flying the mission. 

The weight of responsibility, its conditions and demands was at 
times so highly charged that emotional stability was made impossible. 
On the other hand, responsible behavior could not be exacted from some 
individuals since they had already succumbed to some form of person- 
ality disorder. 

Fear. It was frequently asserted by combat personnel that there 
was no one man who at some time or another did not experience the 
harsh feeling of fear before or during a mission. The truth of this state- 
ment becomes apparent when one takes into account the many mani- 
fold experiences to which the combat man was constantly subjected. 
There is then little wonder that we list fear as one of the major stresses 
operating at all times to distort and disorder the personalities of combat 
men. It is not so much the fact that fear operates independently of any 
other stresses already mentioned or to be mentioned; rather it is the 
unique manner in which it manifests itself during the daily routine of 
flight personnel which constitutes fear a special type of problem. The 
common experiences of flak, gun fire, explosions, and the like were in 
themselves great dangers and were capable of producing fear reactions; 
however, in a great number of instances, anticipatory fears of situations 
produced more marked reactions (5). Perhaps it was the mystery con- 
nected with these possible episodes (e.g., bailing out, ditching, being 
strafed, etc.) which made them all the more fearful. 

The development of fear follows a definite course; a course which 
parallels the combat career of every person so engaged. In a number of 
cases, persons responded to fear in new and resourceful ways and eventu- 
ally overcame it; in other instances, it was fear that won out. Hastings 
describes the development of behavior in which fear plays a most im- 
portant part: 

1. On arrival at an operational station the men were first of all insecure and 
defensive, both consciously and unconsciously, and this was apparent in many 
ways. In action they were either overly self-assured, or particularly diffident, 
usually the former. In speech they might be either loud and continuous, or in 
a few cases “mouse-quiet.” They either spoke continuously of combat or 
avoided it completely. They either ridiculed and paid no heed to any advice 
by experienced men or they took in every possible word of it. They tended at 
first to drink more than the others. They did not accept the possibility that 
they would ever be afraid, and openly spoke of the older men who mentioned 
fear as being “flak-happy” or spiritless. It was quite easy to spot a new group 
of officers at a table in the mess or in a group in the lounge, even if one knew 
none of the personnel of the station. 

2. These defensive attitudes and mechanisms began to disappear quite 
quickly after the men had four or five raids, and had seen and felt the real 
factors in the combat situation. This process was frequently conscious to a 
large degree, and in many cases, the men spoke of the change in themselves, 
and shamefacedly deprecated their former cocky attitude. At about this point 
it was frequent for them to “over-swing,” and to be very conscious of their 
anxiety, having somatic symptoms, and paying attention to them, and feeling 
quite hopeless about their chances of survival, sometimes in consequence, get- 
ting careless of technique, equipment and the like. 

3. The third stage of evolution took place at roughly the tenth raid, by 
which time one or more of several factors had helped effect a further change: 
the man had experienced fear and by now knew that he could deal with it; 
he found that care and skill and coolness in the pilot and crew had a real bear- 
ing upon the question of his return; he saw that his crew and his airplane could 
withstand catastrophe; he developed an “esprit de corps” in regard to his 
squadron, and was now really part of it. He developed for the first time a 
sense of his responsibility to his mates, and to the formation. At this stage 
which continued sometimes until the end of his tour, the men were effective, 
careful, fighting men, quiet and cool on the ground and in the air. They at- 
tained a sort of tranquility in spite of their anxiety. They had very little need 
for defensive mechanisms of any sort to deceive themselves or anyone else. 
They talked easily and quietly, drank little except on pass, and expended 
virtually all of their attention and interest on the job. When they did go on 
pass and over-indulged they usually did so in a peculiarly deliberate way, be- 
lieving over-indulgence was a cathartic sort of release of feelings, which they 
felt to be useful. They were drained of most feelings other than those having 
to do with combat. No values existed other than those meaningful in combat. 

4. Frequently a fourth stage of evolution gradually took place by the be- 
ginning of the last five raids. Its components were probably to a large extent 
physiological, the results of continuous prolonged fatigue and fear. It con- 
sisted, in its most extensive form, of a state of insomnia, fatiguability, weight 
loss, anorexia, indifference, restlessness, loss of concentration and interest and 
efficiency, marked irritability, loss of libido, and a fairly marked depression 
with retardation. It did not necessarily include all of these components, but 
perhaps only several of them; these symptoms could not be considered in the 
category of neurotic manifestations, in that they did not cloud the real issue 
from the flier’s consciousness. He had usually complete insight into their 
cause and mechanism. They also did not give him any “secondary gain,” in that 
he did not accept the release from combat that they might have afforded (5, pp. 
20-25). 

The element of fear may be the result of any of the many factors 
encountered in combat flying. Some feared take-offs, for in many types 
of combat ships the stability of the craft decreases with bomb load and 
increased use in battle. Stories became numerous of accidents which oc- 
curred on take-off and the results of these accidents often acted to 
create phobias in inexperienced pilots. An engine may fail on take-off, or 
a tire may blow out as the plane rushes down the runway — with a full 
load of gasoline and thousands of pounds of bombs such incidents be- 
came tragic affairs. And those who have witnessed such happenings 
readily know the mask of fear on the faces of crew members as they 
rushed from the burning plane, which, in a matter of minutes, will be 
blown to bits by the exploding gasoline and bombs (2). 

Night missions were especially feared by most combat men. The 
uncertainty of night flying and its added hazards made for extreme 
tension and anxiety. Such conveniences (considered in the United 
States as common necessities) as lighted runways, guide markers, for- 
mation lights, radio aids, etc. were not routinely employed in combat 
for reasons of security. This lack made night take-offs in flying ex- 
tremely hazardous, and pilots as well as other crew members reacted 
accordingly with signs of fear and anxiety. The accident rate was 
consistently higher on night missions than on daytime ones. 

On daytime raids, the fear of take-off is just as intense as it is on 
night missions. Perhaps the most critical moment for all crew members 
simultaneously was that of the take-off. One is sure of a certain degree 
of success in connection with the mission as a whole once the ship be- 
comes air-borne. Erroneous as this feeling may be, the psychological 
boost which occurs in combat men once they have become air-borne is 
not to be minimized. Watching a take-off mishap impresses one with 
the extent of helplessness manifested by the crew members involved. 
In a sense, this is not true of the many other dangers connected with 
flying, wherein certain defensive measures can be employed. The alti- 
tude attained on take-off does not allow for freedom of maneuver or the 
use of adequate safety measures, and when an accident occurs the end 
is inevitable: ships are completely blown up or consumed by fire in the 
space of a few minutes. The observer is impressed with all of this, and 
in some respects there is a certain amount of comfort in the rapidity of 
it all. Many combat men become fatalistic after a short time in combat, 
and experiences of such accidents seem to offer some satisfaction in the 
hope that their end, if it should come, might be a similar one. 

That take-off and landing accidents were traumatic situations for 
the production of fear reactions certainly is undeniable and there can 
be little doubt that almost every combat man at one time or another 
witnessed such occurrences. This being so, the extent to which these 
experiences acted to produce fear and anxiety cannot be over- 
emphasized. But, as we have already mentioned, combat experience 
along with the general personality configuration acted to minimize or to 
emphasize the final effect produced. 

Fear of flak and fighters formed two other bases for fear of a most 
effective sort. At the beginning of a tour of duty, the fear of flak was 
not very apparent; however, when flak came into close range and com- 
bat personnel actually tasted some of its “bitter fruit,” then fear of it 
became in many instances obsessional. The inexperienced flier did not 
for a moment realize the nature and complete effectiveness of this type 
of weapon, and for this reason new combat men tended to under- 
estimate the destructiveness of flak. The more common notions about 
flak were: to be effective, flak had to burst very close to a plane; that 
enemy flak batteries were not sufficiently accurate to score direct hits; 
and that there were not enough flak batteries around any single target 
area to be cause for alarm. Experience, however, very rapidly changed 
such opinions. It was learned that flak, like shrapnel, was very deadly; 
that its range of effectiveness was wide; that the Germans had their 
guns radar controlled and that a fair degree of accuracy was possible; 
and that the number of guns around many targets ran into the hundreds. 
It was not at all surprising to note the extent of change which did take 
place after contact with this weapon. 

It was common practice for new crews to complete their first few missions 
over relatively easy targets, that is, targets which were not heavily defended 
and at which little fighter opposition could be expected. The rationale of this 
procedure was to bring about a gradual introduction to the rigors of combat. 
As the missions became “tougher,” one could notice a decided change in the 
behavior of the men; they became restless and revealed considerable anxiety 
about the type and objective of their next mission; anticipatory fears developed 
connected with their position in the formation which connoted varying degrees 
of vulnerability by flak and fighters; and, in brief, a great many other sympto- 
matic changes of personality attendant upon fear and anxiety. There were 
many times in my own experience when flak was so heavy over a target that 
the expression “you could roller skate on it” was a somewhat accurate descrip- 
tion. Such targets as Vienna, Wiener-Neustadt, Munich, Regensburg, etc. 
were reputed to have had many hundreds of flak guns for their defense. 

The accuracy which flak achieved was in many instances outstanding. Guns 
were radar controlled and range estimation was thereby easily and accurately 
computed. With any large formation of bombers, in the time that it took two 
or three groups (a group consisted of approximately twenty-five planes) to 
pass over the target, flak was very accurate and deadly. Thus, when forma- 
tion notices were posted, the men would be very anxious to know in what part 
of the formation they were to fly. As a matter of preventive psychiatry, these 
formation notices were often restricted to the time of briefing, so as not to 
cause undue anxiety among the men who were to fly in the rear echelons. 

To repeat, first contact with flak usually did not arouse many fears 
in combat men, for inexperience with the effects of flak, wrong notions 
concerning its actual and potential danger, and the practice of starting 
new men on relatively easy missions were operative in the early stages 
of a combat career. With every mission, however, crews learned what 
flak could do: physical injury by flak was very painful and many times 
fatal; damage to planes occurred in places which one would never im- 
agine to be vulnerable; and most effective was explosion due to a direct 
hit in the bomb-bays which set off the bombs or gas tanks. Combat men 
witnessing mid-air explosions had their fears enhanced considerably by 
such incidents. 

An enlisted man who had completed fifty-two missions had an extreme 
fear of flak, the mere sight of which caused him to be paralyzed with fear. 
Often he would “freeze” on the toggle bombing switch and on several occasions 
he would not close his bomb-bay doors until called by the pilot to do so. While 
accumulating his fifty-two missions he had witnessed eleven of our aircraft 
shot down; with the eleventh craft were lost two of his close friends. On the 
day he was grounded and hospitalized his own crew was lost to flak — exploded 
over the target. After hearing of this, he expressed an ardent appeal to dis- 
continue flying altogether. (From a report of a Flight Surgeon attached to a 
B-26 Bomber Group). 

In sharp contrast to the reaction to flak was the effect of fighter op- 
position in the production of fear reactions. Fear of fighters was not a 
common experience among crew members. As it was often put, “At 
least one was able to do something about the damn things;” this was 
not exactly the case with flak. Those men who did react with fear and 
anxiety toward fighters did so because of the unfortunate and disastrous 
traumatic experiences with them. Formation flying was not only de- 
signed to bring about optimum conditions for bombing a target, but 
also to maintain a maximum of protection against fighter opposition. 
The possibility of becoming disabled for any one of a great number of 
reasons and having to leave the protection of the formation loomed large 
as a potent fear producing stress. But the factor of self-defense in being 
able to fight back gave one a chance to alleviate his fears; the personal 
implications of the threatened danger were minimized through a con- 
structive attack upon the danger itself. 

Toward the closing phases of the war our fighter planes were so 
numerous and the protection so efficient, that enemy opposition directed 
against our bombers became inconsequential. As a result of our superi- 
ority personal dangers in this direction were considerably lessened, and 
the incidence of fear reactions correspondingly decreased. 

Anxiety. Aero-anxiety ranks next in importance to the factor of fear 
in the production of operational fatigue (8). To repeat, the incidence of 
any of these causal agents often went hand-in-hand and this is especially 
the case with anxiety and fear. Many times it was difficult to differenti- 
ate between fear and anxiety; it was very rare that one occurred without 
the other. This is understandable in view of the nature and conse- 
quences of fear itself. Thus, one of the after-effects of fear is anxiety — 
as manifested in anticipating the same incidents which originally pro- 
duced the fear. With an intensification of the anxiety state there is a 
corresponding increase of fear and a lessening of responsibility. In a 
great measure, all the factors stressed as important in the production of 
loss of responsibility and of fear are likewise operative in the production 
of anxieties. 

Summary 

We have attempted to make clear some of the more important agents 
influencing operational fatigue — defined as abnormal flying strain being 
placed on a normal individual. We have seen that they are very com- 
plex; that they do not occur in isolation, but rather tend to interact and, 
in a number of instances, are all present together. In this respect, each 
becomes a function of every other, and the occurrence of one becomes a 
condition of the eventual appearance of the others. Thus the burden 
of supposedly excessive responsibilities brought on the retardation of 
performance, and prepared for the emergence of fear and anxiety. In an 
analogous manner, experiences which produced fear also brought with 
them anxieties and loss of responsibility. 

The question of predisposition toward personality breakdown has 
not been treated in this paper for the obvious reason that at the outset 
the assumption was made that all personnel engaged in the hazardous busi- 
ness of flying could succumb to operational fatigue. It is debatable, of 
course, whether environmental, climatic, emotional, and similar factors 
are causative in a true sense, or whether they are the triggers that set 
off neurotic behavior in a predisposed personality. Clearly, the litera- 
ture indicates that those men who did break under the strain of battle 
would in a majority of cases not have done so under the conditions of 
peace-time living. The predispositions which they might have possessed 
would have been of little significance in civilian life; however, because of 
the extreme and prolonged types of experiences in combat, they were 
bound to react in violent ways. No amount of psychiatric screening 
could have eliminated these individuals for their condition was in no 
sense deep-seated enough to allow for adequate and complete airing. 
Rehm (9) concludes this matter nicely: 

It is a matter of general understanding and acceptance by medical personnel 
and laymen that air crewmen are a highly specialized and trained group. In 
recognition of the severe strain to which these men are exposed, diligent care 
has to be taken to see that they are maintained in the finest physical condition. 
However, other factors, less tangible, which have a direct bearing upon the 
ultimate combat efficiency of these men, have been placed more or less in the 
background. . . . These factors are the psychological disturbances which arise 
in combat flying personnel — the “gremlins” of the mind, in whose grip even 
the strongest willed man is powerless. 

A REVIEW OF LEADERSHIP STUDIES WITH PARTIC- 
ULAR REFERENCE TO MILITARY PROBLEMS 


WILLIAM O. JENKINS 
Indiana University 

Introduction 

The present report summarizes and reviews selected references from 
the available literature dealing with the problem of the selection of 
leaders in various fields. The primary interest in preparing the article 
was to provide a summary of techniques and results that would be of 
value to psychologists dealing with problems of selecting leaders, par- 
ticularly in the military field.* For this reason, no attempt has been 
made to cover the extensive literature concerning dominance and 
“leadership” with other than human subjects. 

The primary factor considered in selecting items for inclusion in the 
present article, in addition to their relevance to military selection prob- 
lems, was whether or not the material was empirical in nature. For 
illustrative purposes a few speculative reports have been included, but 
the main emphasis is on the presentation of research findings. 

No attempt is made to treat the theories of leadership which have 
been proposed. Few of these theoretical expositions have presented hy- 
potheses concerning the various aspects of leadership which are testable. 
Furthermore, none of them has been comprehensive and systematic to 
the extent of accounting for the obtained information. In a later section 
of this article certain general principles and hypotheses extracted from 
the available literature are presented. 

For the purposes of this article the dictionary definition of leadership 
as the act of guiding or directing the behavior of one or more individuals 
may be employed. A more adequate operational definition should de- 
rive from future research on this problem. 

For clarity of presentation the various studies of leadership have 
been divided into five groups: (1) industrial and governmental investi- 
gations, including studies of executives, administrators, supervisors, 
foremen, etc.; (2) studies of scientific and professional personnel; 
(3) investigations of the activities of children in pre-school and extra-school 
situations; (4) studies in the school situation; (5) military leadership. 

* This article was originally published as No. 190 in the AAF Aviation Psychology 
Abstract Series issued by the Psychological Branch, Office of the Air Surgeon, 
Headquarters Army Air Forces, Washington, D. C. The date of publication was Sept. 20, 
1945. The purpose of the original report was to present typical procedures and results 
obtained as background information for Army Air Forces Aviation Psychologists working 
on problems of leadership. 

Industrial and Governmental Investigations 

Two of the earliest studies of the characteristics of business execu- 
tives in this field were published by Gowin in 1915 and 1918 (26, 27). In 
the first study a questionnaire was submitted to approximately 1,000 
executives, 225 lesser executives, and 200 professional personnel. Data 
were also gathered on 222,000 insurance policy holders. Information 
concerning various characteristics of these individuals was requested. 
Measures of statistical significance were not presented, but differ- 
ences between the major executives and the insurance policy holders 
were reported with regard to height and weight, favoring the former 
group, and the executives were reported to have been subjected to 
somewhat stricter selection than the professionals as indicated by a lower 
coefficient of variation. The executives did not differ greatly from the 
other groups with regard to age at marriage, number of offspring, and 
similar items. “Attitude toward life work” was rated by the 
investigator on a ten-point scale from ten (continued one line of activity 
throughout life) to one (changed line of activity three or more times). 
The results for approximately 200 business executives were compared 
with a control group of “professionals.” No outstanding differences 
appeared. 
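
For readers unfamiliar with the statistic, the coefficient of variation is conventionally the standard deviation expressed as a percentage of the mean. The notation below is assumed for illustration only; the summary above gives neither Gowin's figures nor the exact formula he used. 

```latex
% Coefficient of variation of a sample.
% Symbols are illustrative assumptions, not taken from Gowin's report:
%   \bar{x} = sample mean of the measure (e.g., height or weight)
%   s       = sample standard deviation of the same measure
\[
  CV = \frac{s}{\bar{x}} \times 100\%
\]
```

On this definition a lower value for one group indicates less relative spread, which is the sense in which the executives were taken to have been more strictly selected than the professionals. 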

In Gowin’s second study 276 business executives were asked to rank 
the importance of a number of qualities for administrative ability. The 
qualities ranked at the top were judgment, initiative, and integrity; 
those ranked lowest were refinement, appearance, and sense of humor. 
Laird's (33) findings were similar to these. No definitions of these con- 
cepts were presented nor was further use of the data reported. 

A number of other studies involved the administration of question- 
naires to business executives. In one study, Taussig and Joslyn (55) 
investigated the social classes from which American business leaders are 
recruited; attempted to determine the proportionate contribution of 
each social class to the supply of business leaders; and proposed to study 
the relative influence of heredity and environment on such disparities 
as might exist between the representation of the classes among business 
leaders and their representation in the population at large. A sample of 
15,000 business leaders was selected from a register of directors and 
complete returns were received from somewhat over 7,000 cases on the 
nine-item questionnaire which was employed. The modal individual 
emerging from the reports of this study may be described as follows: 
president (for less than ten years) of a manufacturing or mining 
organization, the gross income of which is between one million and five million 
dollars per annum; a background of living in a community with a popu- 
lation over 500,000 in New York State; between the age of 51 and 52 at 
the time of questioning and between 41 and 43 at the time of entering 
the business; a college graduate with no formal business training who 
reports little or no help in starting in business. 

The authors conclude that the results suggest that lack of native 
ability plays a greater role than lack of opportunity in the failure of the 
lower occupational classes to be as well represented in the sample as the 
higher classes. The evidence, however, was negative rather than posi- 
tive in nature. 

Starch (54) sent a questionnaire to 50 executives making salaries 
over $50,000, to 50 making between $7,000 and $50,000, and to 50 with 
salaries below $5,000. In the opinion of the members of these three 
groups, four characteristics of executives were important: ability to 
think, inner drive, capacity to assume responsibility, and ability to 
handle people. The proportion of the high salary group rating ability 
to think as important was 72%; the percentage of the middle salary 
group considering this aspect important was 58%; and the percentage of 
the low salary group was 46%. For the other characteristics the per- 
centages, respectively, were: inner drive 58%, 42%, and 20%; capacity 
to assume responsibility 68%, 54%, and 20%; and ability to handle 
people 84%, 72%, and 86%. On the basis of these results Starch pro- 
posed a formula to account for differences in earning potential where 
“executive achievement” was the product of “drive” and the sum of the 
factors of “ability to think,” “capacity to assume responsibility,” and 
the “ability to handle people.” Only the results of the questionnaire 
were presented, as given above, and no additional data were reported to 
support the use of such a formula. 
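
The verbal description of Starch's formula can be written compactly as follows. The single-letter symbols are introduced here for illustration and are not Starch's own notation; the review does not state how each factor was to be scaled or measured. 

```latex
% Starch's proposed relation, as described in the text.
% Symbols are hypothetical labels chosen for this sketch:
%   A = executive achievement
%   D = drive
%   T = ability to think
%   R = capacity to assume responsibility
%   H = ability to handle people
\[
  A = D \times (T + R + H)
\]
```

Written this way, drive acts as a multiplier on the other three factors, so that a value of zero for drive would yield zero achievement whatever the remaining ratings. 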

An investigation was made by Sorokin (53) of the biographies of 
1,600 leaders of labor and radical movements. Leaders were classified 
according to their socio-economic background and similar factors. The 
findings indicated that the majority of the labor and radical leaders 
come from families in which the paternal occupation is professional, 
business, or managerial. 

A somewhat different survey approach was employed by Likert (34) 
in a study in which the relationship of managerial attitudes to the 
morale of life-insurance salesmen was investigated. In this study, 
approximately 300 insurance agents in twenty different agencies were 
interviewed systematically and each agent filled in a questionnaire 
concerning his attitudes toward the firm, the management, etc. Each agent 
received a morale score based on the interview and one derived from the 
questionnaire, with the two scores correlating .85. 

The agents indicated that the manager was the chief influence on 
their morale. The major characteristics of the manager which were 
considered important by the agents in this respect were: his attitude 
towards them; his personality in general; and his professional skill. 

A number of investigators have attempted to develop test batteries 
to select executives or administrators. Among these may be listed 
Cleeton and Mason (12) whose battery of tests carries the rather elegant 
title of “Vocational Aptitude Examination for Sales, Technical, and 
Executive Groups.” It was reported that this battery had been ad- 
ministered to personnel in executive positions, but no data were pre- 
sented in this regard. 

Uhrbrock and Richardson (58) administered a battery of nine tests 
to 163 supervisors. Ratings by superior executives were employed as 
criterion scores. Out of a total of 820 items in the nine tests, only 85 
were found to have significant predictive value. The majority of the 
valid items were drawn from company information tests. The following 
personal history items were also significant in this study: age, schooling, 
and military service record. 

In another industrial investigation Bridgman (8) performed a follow- 
up study of the success of 1,310 college graduates in the Bell Telephone 
System. The criterion involved was salary achieved in the company. 
The findings indicated that high scholarship, campus achievement, early 
graduation, and immediate employment in the Bell System were all 
significantly favorable factors for success in this company. Scholarship 
level during the college career appeared to be the most significant factor. 

A study of 100 presidents and vice-presidents of “successful” 
companies has been reported by O’Connor (44). It was stated that five 
characteristics of executives were isolated and measured in the course of 
this research. The five items listed were: (1) large English vocabulary, 
(2) many aptitudes, (3) objective or extremely objective personality, 
(4) accounting aptitude, and (5) aptitude for first position. Descriptions 
of the instruments, conditions of administration and scoring, and data 
supporting this statement were not presented. 

Beckman and Levine (4) studied the Allport A-S Reaction Test, the 
short form of a Personality Inventory (C-2), and a Directions Test in 
an attempt to predict supervisory ability. Various efficiency ratings 
were employed as criteria for an experimental and control group of 
approximately thirty cases each. On the basis of a correlation of .33 
between supervisory ratings and scores on the Allport Test in this small 
sample, it was concluded that this test is of value in selecting executives. 

Along these same lines, a study has been performed concerned with 
the administration of a mental alertness test to twenty-eight minor 
executives in a clothing manufacturing establishment (31). These men 
were rated by six superior executives, and it was reported that close 
agreement was found between the test scores and the ratings given the 
men by their superiors. As in the reports of many of these investigations 
the details necessary for critical evaluation were omitted. 

In contrast to these studies reporting appreciable correlations of 
predictive instruments with ratings by supervisors for executives and 
administrators in industry are results reported, among a number of 
others, by Bingham and Davis (7). In this study, an intelligence test 
was administered to 102 business executives, and experience records 
containing personal information about the individuals were obtained 
from 73 of these men and employed as the criterion. No agreement 
between the two sets of data was found according to the authors. 

Thurstone recently studied administrative ability in governmental 
work (57). A group of perceptual tests which had been shown to dif- 
ferentiate a small group of campus leaders from non-leaders was em- 
ployed along with several new tests that were assumed to predict 
administrative ability. In the first phase of this study, a group of in- 
terns in governmental administration served as subjects. The ten 
interns with the highest supervisors’ ratings of professional promise and 
success and the ten with the lowest ratings were selected and their test 
scores compared. The tests which were found to be most discriminating 
were Gottschaldt Figures, Street Completion, Kohs Block Designs 
(negatively), and a Two-Hand Tapping Test. 

In another phase of the study, test scores were compared with a 
salary criterion corrected for age. For 127 administrators whose salaries 
ranged from less than $3,000 to $10,000 per year, the best single test 
was the linguistic score on the Psychological Examination of the Ameri- 
can Council on Education. Gottschaldt scores yielded a prediction 
which was almost as accurate. A Classification Test involving categori- 
zation of cards with names printed on them, differentiated in that the 
higher paid administrators used fewer categories and fewer single card 
groups. On a test requiring that numerical estimates be made on the 
basis of common knowledge, such as the approximate population of the 
United States, the more successful men on the criterion did better, 
particularly in terms of the proportion of acceptable answers. The following 
scores on the Allport-Vernon Study of Values differentiated the two 
groups: social, theoretical, economic, and religious. The successful men 
excelled on the first two and were inferior on the last two. They also 
had more masculine scores on the Terman-Miles Schedule for 
Masculinity-Femininity. On the Thurstone Vocational Interest Schedule, 
the administrators who had higher incomes made lower scores in physi- 
cal science, physical activity, and commercial interests. 

Richardson and Hanawalt (50) compared the Bernreuter Personality 
Inventory scores of 258 business men divided into groups of office 
holders and supervisors on the one hand and non-office holders and non- 
supervisors on the other. Office-holders and supervisors tended to be 
less neurotic, less introverted, more dominant, more self-confident, and 
more self-sufficient than the control groups and the norms. The dif- 
ferences were statistically significant for a majority of the measures. 

The use of another procedure that purports to pertain to the selec- 
tion of executives is illustrated in a German study by Luithlen (35). 
This investigation employed a test in which single words were printed 
on individual cards, there being two sets of cards, one blue-bordered and 
the other red-bordered. Subjects were paired and one member of each 
pair was given the blue-bordered cards and the other the red-bordered 
ones. Each pair was requested to construct original sentences with the 
cards. In addition to general observations, the measures included time, 
number of constructed sentences, and color and position of the cards. On 
the basis of initial results, individuals were labelled as leaders or fol- 
lowers, and pairs of leaders and followers were given the same problem. 
It was found that when a leader and a follower were paired, the leader 
did most of the work, and when two followers performed, it took longer 
to reach the result, but there was cooperation. When the individuals 
were both leaders, fewer sentences were constructed. Statistical evalua- 
tion of the findings was not presented. 

Studies of Scientific and Professional Personnel 

Investigations similar to those listed above have been conducted 
concerning the characteristics of outstanding scientific and professional 
personnel. Several studies (11, 47, 51, 60, 64) have dealt with the 
characteristics of American inventors and have employed a questionnaire or 
some sort of biographical approach which yielded various normative 
data concerning the characteristics of these persons. 

Typical studies of the characteristics of geniuses, utilizing similar 
methods, are contained in reports by Ellis (16) and Giese (25). Infor- 
mation from these investigations has indicated superiority over the 
general population for the specialized groups with regard to such items 
as socio-economic background and education. They also agree in giving 
ages 25-35 as those of the first achievement for scientists, inventors, and 
literary men. 

An interesting study has been performed by Schneider (52). The 
hypothesis under investigation in this study was that any group of 
distinguished men formed a “birth galaxy” associated with the cultural 
situation which, in turn, is related to a definite period of time affecting 
this entire birth group. The individuals studied consisted of approxi- 
mately 200 English botanists born between 1700 and 1920. On the basis 
of published biographical information concerning these individuals, two 
outstanding periods of activity were uncovered. The conclusion drawn 
from the available data was that there occurred a period in the history 
of British botany when there were fewer persons born who became 
famous botanists than for the immediately preceding or following pe- 
riod. It appeared improbable that any racial factors were involved in 
the decline in the number of botanists from 1700 to 1800 since this 
period was marked by a general rise in the number of births of eminent 
Englishmen. Schneider also found that the fame of military leaders was 
associated with the number of conflicts in which the particular country 
was engaged. 

A study has been reported by Flanagan (19), which pertains indi- 
rectly to leadership, concerning the validity of the 1940 edition of the 
Cooperative Test Service National Teacher Examinations. These ex- 
aminations consisted of the following eleven sections: Reasoning, Eng- 
lish Comprehension, English Expression, Current Social Problems, 
History and Social Studies, Literature, Science, Fine Arts, Mathematics, 
Professional Information, and Contemporary Affairs. 

In order to validate these examinations 49 teachers in 22 school 
systems who had taken the tests were selected in such a manner as to 
maximize the range of scores (20). School supervisors then rated the 
teachers on a graphic rating scale consisting of 50 items, and students 
filled in an educational-report form concerning them. The 
product-moment correlation between the supervisors’ ratings on the 
characteristic of “overall judgment of teachers’ general effectiveness and 
desirability” and score on the examinations was .51. Positive correlations 
were found between all of the 50 rating items and total score on the 
tests. The lowest correlations were for the following ratings: teacher’s 
health, physical appearance, and poise; energy, enthusiasm, and drive 
in school work; quality of speech and voice; sense of humor; congeniality 
of teacher’s adjustment to associates; neatness of teacher’s work in 
classroom; and integrity of teacher’s character. It was found in this 
small sample that the Current Social Problems section of the test yielded 
the highest correlation with the supervisor’s over-all ratings. 
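
The product-moment coefficient reported for this validation is conventionally computed as shown below. The symbols are assumed for illustration: in this setting each of the 49 teachers would contribute one pair of values, an examination score and a supervisor's rating. Flanagan's own computational procedure is not described in this review. 

```latex
% Pearson product-moment correlation between two paired measures.
% Illustrative symbols (not from the source):
%   x_i = total examination score of teacher i
%   y_i = supervisor's overall rating of teacher i
%   n   = number of teachers (49 in the sample described above)
\[
  r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\;
            \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}
\]
```

Because the teachers were chosen so as to maximize the range of scores, a coefficient obtained in this way would be expected to run higher than one from an unselected sample. 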

A correlation significant at the five per cent level was found between 
total score on the examinations and the proportion of the students 
reporting a particular teacher’s name in response to the question “Which 
teachers seem to have a broad knowledge of other subjects besides the 
one you had with them?” In addition, the correlation was significant at 
the five per cent level between a given teacher’s score on the English 
Expression Test and the proportion of the same teacher’s students who 
mentioned his name in response to the question, “Which teachers were 
most clear in presenting their ideas?” 

Pre-School and Extra-School Studies Involving Children 

The three major techniques which have been employed in the studies 
of leadership among children are: (1) observation, (2) nomination of 
companions for leadership positions, and (3) special test techniques and 
situations for identifying leaders. These three techniques are illustrated 
by several studies in the following paragraphs. 

In a study by Parten (45) nursery school children were observed for 
60 one-minute periods and their behavior recorded. There appeared to 
be a greater difference in leadership ability among the children than 
might be attributed to age alone according to the author. In addition, 
it was concluded that the leaders were above average in intelligence. 

The nominating procedure was employed in a study at the Detroit 
Teachers College (64). In this investigation, more than 5,000 grammar- 
school children were asked to write the name of another child that they 
considered their best friend or would like to have as their best friend and 
the reasons for such a choice. The children were also asked to nominate 
the individuals they would like to have as leaders and present reasons. 
There were some outstanding differences reported between the reasons 
for selecting leaders as contrasted with those for selecting friends. 
Ability and achievement appeared to be the main reasons given for 
selecting a particular individual for a leader, and social characteristics, 
such as good sportsmanship, were much less significant for the leader- 
ship choices than they were for the friendship selections. It was found 
that a small number of children, about two to five in a given grade, re- 
ceived most of the votes. 

In another study involving the nominating technique, Partridge (46) 
studied leadership in a sample of 226 Boy Scouts. The five-man-to-man 
rating technique was employed in which each subject rated every other 
individual in the group on leadership ability in groups of five cases with 
the names presented in random order. The characteristics of those 
selected as leaders were recorded and compared with the ratings. It was 
found that leaders were generally superior individuals in such factors 
as height, weight, intelligence, age, and so forth. Wide individual dif- 
ferences appeared in the characteristics of leaders in different groups. 

Partridge also studied the influence of leader attitudes upon group 
attitudes by having the group fill out a questionnaire and selecting items 
on which differences of opinion existed. A group discussion was held 
with a particular leader in charge and the group was then asked to vote 
again on the questions. The number who changed their opinions after 
learning the attitude of the leader was counted. The results of this 
study showed that the amount of influence leaders had upon group at- 
titudes was greater than that of individuals who were not leaders. The 
differences were statistically significant. 

An early study by Terman (56) illustrates the use of special tech- 
niques in studying the problem of leadership. In this investigation 
groups of four children, boys and girls separately, were presented with 
ten pictures and objects on a large card and told that the experiment was 
a memory test and that they could look at the card for ten seconds and 
were then to answer questions concerning it. The answers were recorded 
in the order given. In a second series of tests, different subjects were 
used and the original groups were reformed. Each new group contained 
at least one individual who had been a leader in responding in the previ- 
ous series and one who definitely had not been a leader. An analysis was 
made of the number of first, second, and third responses that were made 
to the stimulus cards in comparison with information concerning the 
children. The results on the small sample involved indicated that the 
leaders tended to be larger, brighter in school work, better looking, 
better dressed, more widely read, less emotional, more fluent, of more 
prominent parentage, and more daring. 

Typical of a series of researches by Lewin and his associates is a 
study by Bavelas (3) concerning morale and the training of child leaders. 
The study dealt with the retraining of leaders with low morale. Three 
mediocre leaders on a WPA project, as judged by ratings, were selected 
and equated to a control group of three cases on the basis of age, sex, 
length of time on WPA, length of time on present WPA project, rating 
of technical skill, rating of leadership ability and relevant life history 
factors. All six individuals were tested by observing and recording their 
actual behavior with children on the job prior to and after training, and 
by recording the behavior of the children supervised by them. The 
experimental group was trained for two hours a day for a period of 
twelve days by indicating to them in detail the attitudes, objectives, and 
philosophy of recreational group work. During this time the experi- 
mental and control groups continued their work with the children. Dur- 
ing the fourth week both the trained and the non-trained individuals 
were tested again on the job by the same methods of observation and 
recording. Prior to training, leaders employed an authoritarian form of 
giving commands approximately 60% of the time, and the remaining 
portion of the time responded to the approach of the children. Follow- 
ing training, the experimental group used praise, invested the children 
with responsibility, and revealed their preferences rather than employ- 
ing their previous approach. The number of children working together 
more than doubled for the experimental group while increasing approxi- 
mately 40% for the control group. 

Studies in the School Situation 

As in several of the previous areas treated in this report, three gen- 
eral techniques have been employed in studies of leadership behavior in 
the school situation. These are the nominating procedure, the survey 
procedure, and the use of special test techniques. Several examples of 
each of these procedures are given below. 

In a study which illustrates a type of nominating approach to the 
problem of leadership, Nutting (43) had a group of girls list the reasons 
for selecting certain individuals as captains for their gymnasium teams. 
An examination of the characteristics of these captains indicated that 
they were slightly above average in age, physical ability, and intel- 
ligence, and slightly below average in scholarship. They were noticeably 
high in “popularity.” 

The survey procedure is exemplified in a number of studies. 
Caldwell and Wellman (10) found that leaders in different activities 
exhibited different characteristics; for example, boys chosen as class 
presidents or other representatives were taller than average whereas the girl 
leaders were approximately average in height. Practically all leaders 
were high in scholarship, and although physical achievement was an 
important characteristic of athletic leaders, it did not appear to be 
relevant for other types of leadership. According to the results of a rating 
scale for extroversion, most of the leaders were more extrovert than in- 
trovert. Studies by Bellingrath (5) and Brown (9) employed similar 
procedures and yielded comparable results. 

Zeleny (62) tried out several different procedures for selecting 
leaders in a group discussion situation. The first technique tried out was 
“identification by voice and appearance” which was reported to be 
suitable for speedy preliminary selection of leaders as judged by the 
supervisor of the group discussion. After groups of four or five individuals 
were formed, the “status-ranking technique” was used in which each 
individual ranked all others in his group and the average rankings were 
checked against special ratings of leadership by the group and by the 
group supervisor. This technique was reported to show some relation- 
ship between those selected by the group and the professor’s ratings of 
the individuals on leadership qualities. The third procedure employed 
was the five-man-to-man rating technique. In Zeleny’s situation a co- 
efficient of .72 was reported between such ratings and the ratings by 
the faculty leader of the group discussion for a sample of twenty-one 
cases. The degree to which the faculty representative was familiar with 
the ratings of the students was not reported. The final technique was 
known as "sociometric identification" in which each individual listed 
five choices for leader of the group discussion in order of preference. 
According to the author, by this technique one could determine those 
most chosen and also those with whom both leaders and followers would 
most like to work. 

In another study by Zeleny (63), the five-man-to-man and 
status-ranking techniques were found to yield ambiguous validation data. 

The extent to which several hundred girls desired to live and work 
with one another was rated by each member of the group in an 
investigation by Jennings (29). This procedure yielded a test-retest correlation 
after eight months for positive expression of choice of .65 and for 
rejection of .66. It appeared that those selected as leaders, to a very 
much greater extent than the average member of the group, construc- 
tively contributed to enlarging the social field for participation of other 
persons. The findings indicated the existence of many individual dif- 
ferences and an overlapping of characteristics between the leaders and 
non-leaders. 

A factor-analysis study of the personality of high school leaders has 
been performed by Flemming (21). The criterion of leadership involved 
the assigning of points based upon positions of leadership held by 71 
girls in grammar school. The predictors consisted of a check list of 46 
traits to be filled out by the teacher for each girl, a ten-point scale con- 
cerning "the intensity of pleasant feeling" that each girl "subjectively 
associated with every other girl in her class," and a rating by the teacher 
on a ten-point scale of "the amount of personality each girl possessed." 
Thurstone’s simplified method was employed for analyzing the matrix of 
intercorrelations between these two sets of data. 

A correlation of .50 was found between leadership as defined above 
and personality as rated by the teachers, and a coefficient of .33 between 
leadership and pleasingness of personality as rated by the girls. 

Flemming concludes that there seemed to be four types of leadership 
ability in this group. Scores on the predictors for these selected traits 
yielded a correlation with the leadership criterion of .57 on the same 
sample. No cross application of the weights to a new sample was re- 
ported. 

Thurstone (57) has performed a study in which the scores of 18 
campus leaders on a battery of tests were compared with scores for a 
group of several hundred college non-leaders. One of the most con- 
sistent findings was that the campus leaders were less subject to visual 
illusions as determined by various tests. They gave indication of color 
dominance and superiority on a color-form sorting test, were less subject 
to brightness contrast, and showed a slower rate of alternation of am- 
biguous perspective in the Necker Cube. On the Rorschach, the leaders 
excelled in the total number of responses, were superior in the per- 
ceptual organization score, showed greater latency for color cards, and 
a smaller number of responses to the color cards in comparison to the 
black-and-white cards. 

A comparison of the performance of college leaders in extra-curricu- 
lar activities with the norms for the Bernreuter Personality Inventory 
showed significant differences favoring the former in dominance and the 
latter in introversion (49). Other findings were unreliable or incon- 
sistent. These same authors have reported an item analysis of this in- 
ventory for the responses of 36 women college leaders and 45 non-leaders 
(28). 

Military Leadership 

A considerable amount of literature has accumulated during recent 
years concerned with non-empirical definitions of military leadership 
and speculations concerning the choice of leaders. More recently, mili- 
tary psychologists have devoted research efforts to certain phases of the 
problem of leadership. Typical information in this area is summarized 
below. 

General. A series of books and pamphlets have been published by 
military writers concerning the problems of military leadership (1, 2, 
13, 37, 38, 39, 48, 67, 70). Unfortunately, none of these tomes is based 
on empirically determined evidence and all of them reflect the personal 
opinions and speculations of the authors. Typical of this group of
publications is a book by Ageton (1) concerning naval leadership. The
author lists and discusses the following characteristics of leadership
which he considers to be of significance: simplicity, self-control, tact,
honor, adherence to duty, and loyalty. The following types of catch- 
words and phrases are used to illustrate what the author believes are the 
characteristics of good leaders: practice what you preach, be cheerful, 
be a seaman, know your stuff, avoid careless criticism, etc. 

A manual recently issued by the U. S. Bureau of Naval Personnel (67) 
presents a series of principles for younger officers to be used in selecting 
men for promotions and leadership positions. Certain of the principles 
appear to be based on widely accepted psychological postulates, but no 
evidence to support their use in this context is reported. 

The main item of value that can be derived from the speculative re- 
ports on leadership appears to be an interpretation of the various mili- 
tary regulations which pertain to the problems of leading in military 
situations. These interpretations, discussions, and speculations con- 
cerning problems of leadership and how to behave as a military leader 
may be of value in helping to demark an area of research in which prob- 
lems of leadership may be attacked. 

An obvious starting place for a treatment of military leadership is a 
discussion of the procedures employed for selecting officers and officer 
candidates in various countries. Several of these are described in the 
following paragraphs. 

Selection of officer candidates in the U. S. Army. Typical of the
procedures employed in the armed services of the United States during
World War II were those used in the U. S. Army. In this branch of the 
service, there were two types of commissioning: (1) direct commissioning
from civilian life and (2) receiving a commission following attendance
at an Officer Candidate School. Procedures for selecting U. S.
aircrew officer candidates have been described in detail elsewhere and
will be omitted here (for example, see the series of articles issued by the
AAF Aviation Psychology Program, this Journal, 1943-1945). 

In the early days of the war, a number of men were commissioned 
directly from civilian life into the U. S. Army. In practically all cases, 
these commissions were for specialist duties, particularly general
administration. One factor that played a role in such commissioning was
previous military experience, either Reserve or Regular Army. For 
such commissioning, there was an age minimum which was 30 in the 
early days of the war, and, in addition, the individual was required to 
pass a physical examination. In most cases the men were required to be 
college graduates. Other than these requirements and that of being 
qualified for some type of specialized duty there were practically no
requirements. A Selection and Review Board examined the available
information and accepted or rejected candidates. Occasionally men were
interviewed by this Board. Practically all of the individuals commis- 
sioned in this manner did not participate in combat operations. There 
was no systematic follow-up study of the on-the-job performance of 
men so commissioned. 

The other route to a commission in the United States Army
consisted of attending an Officer Candidate School and, upon graduation,
receiving a commission as a Second Lieutenant. The necessary
prerequisites for such commissioning were as follows. First, it was
necessary for the individual to make application and to meet certain
requirements. Among these requirements were the meeting of a cut-off
score on a general intelligence test, known as the Army General
Classification Test, and the passing of a physical examination. Another
necessary prerequisite for attending Officer Candidate School was that
the individual receive the recommendation and approval of his immediate
commanding officer who presumably was familiar with the work done by
him. In addition, each man making application for attendance at 
Officer Candidate School had undergone a certain amount of basic 
training in an enlisted status. The final step in the sequence of applica- 
tion for admission to OCS was the meeting of an Officer Interview Board 
who interviewed each man individually and reviewed the various avail- 
able information on his qualifications and finally approved or rejected 
his application. Many of the criteria employed by these boards were 
subjective in nature. If approved, the individual attended an Officer 
Candidate School and took various courses, differing with the branch 
of the service. If graduated, he received a commission as a Second 
Lieutenant, and, if eliminated, he was returned to duty as an enlisted 
man. 

Selection of officer candidates in the British Army. An early report on 
officer selection procedures in the British Army is available in Aviation 
Psychology Abstract No. 56 (72). More recently a report has been re- 
leased describing testing procedures for officer candidates in Great 
Britain at the close of the war (22). The basic element of the selection 
system was the War Office Selection Board (W.O.S.B.) which consisted 
of a President (Colonel), a Military Testing Officer (M.T.O.) (Major or 
Captain), and a commissioned psychologist and psychiatrist. Candi- 
dates were organized into groups of approximately eight for testing 
which lasted three days. The tests were of three kinds: pencil and paper, 
practical or field, and interviews. The printed tests included an educa- 
tional and occupational questionnaire and a personal and medical one; 
three standard, general intelligence tests including mathematics and 
figure analogy items requiring about 20 minutes each; and certain
personality tests.

There were three practical tests: individual situations, command 
group situations, and leaderless group situations. In the case of the first 
the candidate might be asked to prepare and deliver a short talk to 
fellow candidates on group morale. In the second type of test the 
candidates took turns organizing their groups to solve an immediate, 
practical problem. The last situation was a test in which the group was 
faced with a task such as escaping across an electrically charged fence 
without the M.T.O.’s having appointed a leader. 

The procedures for scoring these various practical tests were not de- 
scribed, but were presumably subjective in nature. The difficulties in 
standardizing and objectifying such field measures in order to achieve 
adequate reliability and validity are obvious although the attempt to 
use such techniques is noteworthy. 

A series of interviews was also employed including a psychiatric
interview, an “officer quality” interview carried out by the Board
President, and a technical interview for qualification in special service
branches performed by a specialist in the particular field. A final Board 
Conference was held to meet the candidate and discuss his potentialities 
as officer material. 

Data comparing ratings of the candidates during training with test 
scores were reported, indicating superiority of the men selected by these 
procedures over those who had entered service prior to the initiation of 
these selection techniques. Without additional information concerning 
the circumstances of the validation, particularly about the criterion of 
proficiency, it is not possible to evaluate the findings. 

Selection of officer candidates in the German Army. A detailed de- 
scription of German testing procedures for officer candidates as of 1941 
is presented in a volume issued by the Committee for National Morale 
(17). The selection techniques are reported to include a life history 
examination, expression analysis (facial, body, voice, appearance, and 
handwriting), mental capacity as measured by intelligence and interests 
tests, including projection and completion examinations, and action 
analysis. The latter consisted of two tasks, the first being the “command 
series” in which the candidate was given a series of orders to execute; 
his behavior was observed and recorded by the testers. The second 
phase of action analysis was the leadership test in which a group of
infantry soldiers were placed under the command of the candidate who
supervised the execution of a pre-assigned task, such as the assembling
of a prefabricated bridge. The behavior of the candidate in executing
this task was noted, and the men working for him were questioned.
The similarity of these methods to those of the British is apparent. In 
addition, an apparatus test was employed involving differential reaction 
to a number of levers in response to different patterns of red, white, and 
blue lights and to lights with circles and squares which were successively 
lit. The candidate’s errors and speed of response were automatically 
recorded, and his behavior while taking the test was observed. A test 
of choice reaction to auditory stimuli with the Rupp falling rod ap- 
paratus was also employed. 

The leadership examination was conducted at an Armed Forces 
Testing Station by a board of examiners consisting of an Army Colonel, 
a medical officer, and three psychologists. Two full days were devoted 
to testing for the Army and two and one-half for the Air Forces. The 
general tests were given to groups of four and five at a time, and the 
interviews were conducted individually by the psychologists. Round- 
table discussions with the candidate were held and, during the one day 
interval interspersed between the two days of testing, the candidate 
was observed. Procedures for scoring the tests, for combining the scores 
statistically, and for follow-up studies of predictive efficiency apparent- 
ly were not reported by the German psychologists. 

Fitts (18) has recently reported the details of German selection pro- 
cedures, particularly those in the Air Force, based on interviews with 
the military psychologists involved. Several items pertinent to the 
present discussion are contained in his report. For example, he points 
out how political, practical, and scientific factors contributed to the 
breakdown of the German Aviation Psychology Program. Of consider- 
able significance in the present connection is the fact that Fitts was un- 
able to locate any validation data which appeared to meet ordinary 
scientific criteria. The German program is a beautiful example of the 
uselessness of elaborate testing techniques hand in glove with complete 
disregard of their necessary concomitants — standardization, objectifica- 
tion, and validation. 

Research proposals and studies. A number of proposals have been 
made and several researches undertaken dealing directly or indirectly 
with military leadership (see, for example, 6, 14, 15, 30, and 69).
Typical of these suggestions and investigations are the following items.

A group at Harvard (61) has proposed the use of a physical fitness 
test, the Harvard Step Test; an interview; and somatotype measures 
for selecting combat officers. No over-all comparison of combined scores 
with criteria of officer efficiency was presented. 

Murray and Stein (42) have made suggestions concerning the
selection of combat officers. Their emphasis and philosophy appear to be
clinical in nature, not unlike the approach of the German psychologists 
reported previously. The measures included an interview, Murray’s 
Thematic Apperception Test administered individually and to groups, 
a Construction Test similar to the British practical tests and the Ger- 
man leadership tests, and a complex perceptual-motor test taken under 
verbal stress. A group conference was held at the end of testing for a 
terminal evaluation based on test performance and clinical judgments 
of leadership characteristics. Validation data were not reported for 
the use of this test battery. 

A description of a number of potential measures for selecting combat 
leaders has recently been presented (36). These include an interview 
stressing participation in outdoor activities and a background of self- 
confidence, several perceptual-motor tests, and a printed test of military 
adaptability consisting of a series of multiple choice items in the form of 
descriptions of situations based upon actual combat events. 

The results of a number of military studies pertaining to leadership 
are not available for release at the present time, but several in the 
Army Air Forces have pointed out a number of deficiencies in the re- 
liability and validity of the criteria involved (66, 68, 71, 73). The find- 
ings of these investigations indicate a lack of agreement between inde- 
pendent judges when rating on over-all performance, and recommenda- 
tions were made in these studies for the use of ratings of work samples 
of performance in specific situations which have been directly observed 
as criteria. The difficulties of finding measures that correlate highly 
with success in Officer Candidate Schools have been reported by other 
investigators (23, 24, 74). 

A few investigations of military leadership that have been released 
are treated in the following paragraphs. Several reports of studies have 
been issued by the Information and Education Division of Army Service 
Forces concerned with morale in relation to problems of leadership in 
the U. S. Army (68, 69). 

In the first study privates in two regiments of an infantry division 
took an attitude questionnaire during their first few weeks in the Army, 
and the results of the study were reviewed to see how the morale at- 
titudes of men who rated promotions compared with those of other 
men. The study showed that the privates who became non-commissioned 
officers some months later were likely to differ from their buddies in the 
following regards: better disciplined, more self-confident, more likely to 
feel that what they were doing in the army was worthwhile, and more 
favorable attitudes toward their commissioned and non-commissioned 
officers. The statistical significance of these findings was not reported
nor were the items on which differences were low or negative presented.
The fact that men who get along with their officers tend to be promoted
is not surprising. As another part of the study, line noncoms in the two 
regiments (assumed to be the best enlisted infantrymen) were compared 
with privates and pfc’s on such factors as education, AGCT scores, 
physical characteristics and mechanical aptitude. The noncoms were 
generally found to have more education, intelligence, and mechanical 
aptitude, and to be slightly taller and heavier than the privates and 
pfc’s. In all the above cases, however, the differences between noncoms
and privates and pfc’s were small. It was evident that the most striking
differences between the two groups were in their morale attitude. 

Another study (69) reports a check-list of company leadership prac- 
tices. This report is based on a study of practices among 34 Army 
Service Forces companies in the continental United States. The prob- 
lem was the relationship between company leadership practices and 
company morale. Twelve selected companies answered eighteen ques- 
tions concerning leadership practices in their own outfits. Six of the 
companies were all rated high in morale by their post or battalion com- 
mander, their company officers, and their enlisted men. The six other 
companies were all rated low in morale by corresponding judges. The 
companies rated highest in morale by these judges were favorably 
rated by their own men on nearly all company practices. Those rated 
lowest fared very poorly in this respect. The items on which two-thirds 
or more of the men expressed favorable opinions, in all six of the com- 
panies rated highest in morale, involved an expression of interest in the 
men by the officers. 

The Multiple-Choice Group Rorschach Test has been administered 
to 56 officers selected by unit commanders as being actually excellent 
officers (30). Their scores were compared with those of 257 officer 
candidate students enrolled in three classes. The scores for the excellent 
officers were approximately the same as those for the officer candidates. 
The difference was not statistically significant. Forty-five per cent of 
the excellent officers and 35% of the officer candidates had four “poor
responses” which was set as the critical score for initial screening
purposes by Harrower-Erickson. Results on a Health Inventory, a
Psychasthenic Inventory, and a Level of Aspiration Inventory also revealed
no significant differences between the two groups. 

Several studies dealing with military leadership were performed by 
the Applied Psychology Panel of the National Defense Research Com- 
mittee during the war (23, 24). In a preliminary study in Infantry and 
Field Artillery Officer Candidate Schools it was found that individual 
raters tended to be consistent from one week to another when members 
of each platoon rated one another. Correlations between the platoon 
members’ ratings and platoon leaders’ ratings were also high, but ap- 
peared to be spurious since the member ratings were available to the 
platoon leader before his ratings were reported. 

In follow-up studies data were analyzed dealing with the forecasting
of success or failure by various predictors in these same schools. It was
concluded that none of the items obtainable before or upon entrance to
Officer Candidate School is highly significant of later performance;
quality of academic work is significantly related to success or failure in
O.C.S.; and platoon leaders’ ratings of leadership correlate highly with
success as would be expected in this situation, where a man is eliminated 
if his ratings by the platoon leader are consistently low. 

In another study, 176 men on combat duty were rated by their 
superior officers on a five-point scale in a situation that indicated that 
the rating officers were taking account of the actual performance and 
were not rating simply on general impressions. There was some spread 
in the ratings with 13% of the 176 officers receiving superior ratings, 
49% excellent, 23% very satisfactory, 10% satisfactory, and 5% un- 
satisfactory. These combat ratings were compared with final company 
ratings in O.C.S., with Army General Classification Test scores, and 
with age. The conclusions drawn from this study were as follows: 

1. Combat efficiency is not very closely related to ratings of leadership ob- 
tained in O.C.S. The correlation between the two sets of ratings was .15 with 
a standard error of .075. 

2. It is reasonably clear that above a certain desirable minimum, intelligence 
as measured by AGCT has little relevance to combat performance. 

3. At the present time the Infantry School is eliminating a large number of 
those who would probably be unsuccessful officers. In addition, there appeared 
to be a practically zero relationship between age and combat ratings in this 
study. 
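
The arithmetic behind the first conclusion can be checked roughly. The sketch below assumes that all 176 rated officers entered the correlation, and it tries two textbook approximations for the standard error of a correlation coefficient; the report does not say which formula was used, so both are assumptions made for illustration.

```python
import math

def se_r_large_sample(r: float, n: int) -> float:
    """Classical large-sample standard error of r: (1 - r^2) / sqrt(n - 1)."""
    return (1.0 - r * r) / math.sqrt(n - 1)

def se_r_null(n: int) -> float:
    """Standard error of r when the true correlation is zero: 1 / sqrt(n)."""
    return 1.0 / math.sqrt(n)

# r is the value reported in the text; n = 176 is assumed to be the full sample.
r, n = 0.15, 176
se_a = se_r_large_sample(r, n)
se_b = se_r_null(n)
ratio = r / 0.075  # reported r divided by the reported standard error

print(round(se_a, 3), round(se_b, 3), round(ratio, 1))
```

Under either approximation the standard error comes out close to the reported .075, so the coefficient of .15 is only about twice its standard error.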

One study that is of considerable interest in the present connection 
has recently been undertaken by the Personnel Research Section, Clas- 
sification and Replacement Branch, Adjutant General’s Office (6), in- 
volving the development of instruments for selecting officers for 
positions in the post-war Army. Certain of the instruments derived 
from this study are now being employed in selecting Regular Army 
personnel from available applicants. Five kinds of procedures were 
employed in this study. 

The first of these was the Officer Evaluation Report (OER) which was de- 
veloped as an improvement over the procedure of averaging previous efficiency 
ratings. The OER consists of five sections, the first two of which were designed 
to improve its objectivity. The third section contains five items which appear 
to measure all three of the major named rating factors — leadership, sense of 
duty, and stability, which were derived on the basis of a factor analysis of War 
Department and AAF efficiency forms. The last two sections of the OER con- 
cern recommendations for the Regular Army and for the Officer Reserve Corps 
and an over-all rating giving the officer’s standing among previously known 
officers of the same grade. 

The second instrument which was developed is the New Interview Board. 
Its main function is to evaluate the officer’s ability to get along with people in 
line and staff functions. The interview form provides two work sheets upon 
which to record observations, reactions, and ratings on three areas called bear- 
ing or manner, voice and language, and personality traits. The manual and 
prepared work sheets attempt to emphasize the social nature of the interview 
and minimize the influence of prejudice by forbidding prior acquaintance of 
any one of the five-member board with the applicant, any prior knowledge of 
general experience or background, or any attempt to explore or evaluate any 
phases of these not brought out by discussion during the interview. 

The third primary measure was a Biographical Information Blank requiring 
the reporting of background information of a personal, educational, and occu- 
pational nature in combination with self-description of traits, attitudes, and 
habits. The blank is based upon the previous experience of the National Re- 
search Council Committee on Selection and Training of Aircraft Pilots, National 
Defense Research Committee, the Worker Analysis Section of the Occupational 
Research Program of USES, and the U. S. Navy Department. 

Two other instruments were also developed in the course of this project to 
check general learning aptitude and educational achievement: (1) an Officer 
Classification Test, a test of general intelligence, and (2) a General Survey 
Test, a measure of educational achievement. 

Approximately 15,000 officers participated in the field trials in 
about fifty Army installations. The criterion involved a nomination 
procedure as follows. The officers were brought together
in small groups of 15 to 30 who were well enough acquainted with
each other’s work to evaluate it. Each officer was requested to prepare 
two lists of the officers in his group, those High and those Low in gen- 
eral all-around value to the Army, listing them from highest to lowest, 
next highest, next lowest, etc. Each officer was requested to designate 
the five officers of the group most closely medium or middle with 
respect to over-all value. Only officers who fell clearly into one of 
the three groups — High, Middle, or Low — in the sense of virtually 
perfect agreement with their fellow officers (never more than one
dissenting vote for high or low officers, and never more than two dissenting
votes in the case of middle officers) were employed for further study. In 
addition any officer placed over one group away by the Commanding 
Officer was eliminated. A final sample of 3,000 cases was employed 
which appeared to be selected with 1,000 in the Top group, 1,000 in the 
Middle group, and 1,000 in the Low group. 

The Biographical Information Blank and the Interview were com- 
bined with the Officer Evaluation Report by statistical weighting pro- 
cedures to secure a Combined Point Index. Pearsonian correlation 
coefficients were computed by arranging the criterion variable in the 
three categories, H, M, and L, and each predictor as a continuous 
variable. These coefficients were as follows: Officer Evaluation Report 
.60, New Interview .39, Biographical Information Blank .35,
Combined Point Index .67, Previous Efficiency Report .45, and Traditional
Army Board .09. It should be noted that these coefficients may have
been increased by selecting samples of 1,000 for the Top, Middle, and 
Low groups from the total sample of 15,000. The degree to which the 
Commanding Officers’ ratings were weighted in the Officer Evaluation 
Report was not stated, but it appears likely that this factor played an 
important role. Substantial agreement between ratings by the C.O. 
and by fellow officers was to be expected. Since the OER had the highest 
validity, and the other measures when combined with it increased its 
correlation with the criterion only .07, these questions suggest the 
necessity for a further examination of the nature of the criterion here 
employed. 
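
The suspicion that selecting clear-cut High, Middle, and Low cases raises the coefficients can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the true correlation of .40 is hypothetical, both variables are taken to be normally distributed, and selection is made strictly by rank on the criterion, which simplifies the nomination procedure actually used. The group sizes are those given in the text.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)            # fixed seed so the sketch is repeatable
RHO = 0.40                # hypothetical true correlation, not from the study
N, GROUP = 15_000, 1_000  # sizes taken from the text

# Simulated officers: (criterion standing, predictor score), correlated RHO.
officers = []
for _ in range(N):
    c = random.gauss(0.0, 1.0)
    p = RHO * c + math.sqrt(1.0 - RHO ** 2) * random.gauss(0.0, 1.0)
    officers.append((c, p))

r_full = pearson([c for c, _ in officers], [p for _, p in officers])

# Keep only the clearest Low, Middle, and High cases on the criterion.
officers.sort(key=lambda cp: cp[0])
mid_start = (N - GROUP) // 2
low = officers[:GROUP]
middle = officers[mid_start:mid_start + GROUP]
high = officers[-GROUP:]

codes = [1] * GROUP + [2] * GROUP + [3] * GROUP  # L, M, H coded 1, 2, 3
scores = [p for _, p in low + middle + high]
r_selected = pearson(codes, scores)

print(f"full sample r = {r_full:.2f}, selected-groups r = {r_selected:.2f}")
```

In this sketch the coefficient computed on the 3,000 selected cases exceeds the one computed on all 15,000, which is the direction of bias the paragraph above warns about. How large the effect was in the actual study cannot be inferred from a simulation of this kind.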

Neither the Officer Classification Test nor the General Survey Test 
was related to the High, Middle, or Low criterion classification of the 
men, a finding which might have been expected since college graduation 
had been a prerequisite for most men directly commissioned and a score 
of 110 on the Army General Classification Test had been required for 
admission to O.C.S. or cadet training. The General Survey Test 
showed moderate agreement with educational level reached. A cut-off 
score on the Officer Classification Test was recommended as a first 
hurdle before the Combined Point Index is applied. With regard to the 
General Survey Test, it was stated that while no cutting score on this 
instrument was recommended, it was believed that this test might be a 
better requirement than a general educational prerequisite. 

The general conclusion drawn from this investigation was that the 
Combined Point Index will tend to select officers satisfactorily on the 
basis of past and present performance. 

A research project concerned with combat leadership has recently 
been reported from the AAF Aviation Psychology Program (15). While 
the findings of these several studies cannot be released at the present 
time, the techniques are worth mentioning briefly. The first technique 
involved the collection of anecdotes dealing with the types of behavior 
exhibited in combat by men designated as successful or unsuccessful 
leaders or by those who assumed this role in cases of emergency. These 
materials were systematized and categorized, and secondly a rating 
scale was derived which served as a check on the results of the previous 
method. The third technique involved the use of one of the more objec- 
tive of the available criteria of leadership, namely, promotions. Data 
from a number of selection devices were compared with this criterion. 

Discussion and Conclusions 

As an over-all generalization it appears that Viteles’ (59) statement 
in 1932 concerning research on the selection of executives still holds for 
the general area of leadership: the record of accomplishment is not a 
brilliant one. No single trait or group of characteristics has been
isolated which sets off the leader from the members of his group. Several
writers (32, 40, 41) in summarizing the literature in specific areas have 
stressed the cultural and situational determination of leadership. They 
have also pointed out the existence of wide individual differences within 
a given group as well as between groups. 

Advances in methodology in this field are definitely not striking. 
Three techniques have generally been employed: nomination of members
of the group for positions of leadership, survey of the characteristics
of outstanding individuals involving the use of questionnaires or pub- 
lished biographical information, and the use of selection test techniques 
for identifying leaders. Progress has not been made in the development 
of criteria of leadership behavior, nor in the setting-up of an adequate 
working definition of the concept to guide research in the isolating of 
leadership traits. 

The situation does not appear to be a particularly happy one with 
regard to the deriving of general principles or of setting up a systematic 
theory of leadership from the available information. A few statements 
may be set forth, however, that appear to hold for the findings of a 
number of the investigations reviewed; this list should be thought of as 
a series of hypotheses for further investigation. 

1. Leadership is specific to the particular situation under investigation. 
Who becomes the leader of a given group engaging in a particular activity and 
what the leadership characteristics are in the given case are a function of the 
specific situation including the measuring instruments employed. Related to 
this conclusion is the general finding of wide variations in the characteristics 
of individuals who become leaders in similar situations, and even greater 
divergence in leadership behavior in different situations. 

2. In practically every study reviewed leaders showed some superiority
over the members of their group in at least one of a wide variety of abilities.
The only common factor appeared to be that leaders in a particular field need 
and tend to possess superior general or technical competence or knowledge in 
that area. General intelligence does not seem to be the answer for, as one 
writer (32) has pointed out, public leaders have ranged all the way from dull 
normal to genius. 

3. Leaders tend to exhibit certain characteristics in common with the 
members of their group. Two of the more obvious of these characteristics are 
interests and social background. 

4. Certain past history or background items appear to characterize leaders 
in certain activities. There is a widespread but vague hint that certain poorly 
defined personality traits characterize individuals holding positions of
responsibility. It is practically impossible to evaluate these suggestions
without additional research.

5. A number of studies suggest superiority of leaders over those in their 
group in physique, age, education, and socio-economic background, but the
need for further research in this connection is evident.

Summary 

This article reviews the findings and techniques of a number of in- 
vestigations of leadership in industry and government, the professions, 
school, and military situations. Special emphasis is given to the proce- 
dures for selecting officer personnel in the armies of the United States, 
Great Britain, and Germany; research studies of military personnel are 
stressed. Findings that appear to be common to a number of investiga- 
tions are presented as hypotheses or possible principles of leadership 
behavior. 

COMMENT ON “THE VALIDITY OF PERSONALITY
QUESTIONNAIRES”

CHRISTIAN PAUL HEINLEIN 
Florida State College for Women 

In the September 1946 issue of the Psychological Bulletin,* Dr. Albert 
Ellis reviews a series of studies involving the use of personality in- 
ventories of different kinds. After considering the arguments for and 
against the various practices of validating personality questionnaires, he 
proceeds to evaluate the validity of selected research results by means 
of the following verbal categories assigned to arbitrary segments of the 
familiar Pearsonian scale of correlation coefficients: 

1. Negative validity (r’s of from .00 to .19) 

2. Mainly negative validity (r’s of from .20 to .39) 

3. Questionably positive validity (r’s of from .40 to .69) 

4. Mainly positive validity (r’s of from .70 to .79) 

5. Positive validity (r’s of from .80 to 1.00) 
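
The five-fold scheme amounts to a simple lookup on the size of r. The sketch below is one way of writing it down; the function name is hypothetical, and the treatment of values falling between the printed limits (for example, between .19 and .20) is an assumption, since the list gives limits to two decimals only.

```python
def ellis_category(r: float) -> str:
    """Return the verbal label attached to a validity coefficient r.

    Assumption: each printed upper limit (.19, .39, .69, .79) is read as
    "below the next category's lower limit".
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("the listed categories cover r from .00 to 1.00 only")
    if r < 0.20:
        return "Negative validity"
    if r < 0.40:
        return "Mainly negative validity"
    if r < 0.70:
        return "Questionably positive validity"
    if r < 0.80:
        return "Mainly positive validity"
    return "Positive validity"

for value in (0.15, 0.45, 0.75, 0.85):
    print(value, "->", ellis_category(value))
```

Written this way, it is plain that the labels carry no information beyond the numerical interval in which r happens to fall.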

In arriving at this five-fold verbal classification of validity, Dr. 
Ellis remarks: 

“Since holding personality test evaluations to terms of Z) or E rather than
r would probably be unfair (italics mine) at the present stage of their develop- 
ment, we shall, in this review, usually evaluate the reported coefficients of 
correlation in terms of the conventional estimations given them in the con- 
sideration of psychological and educational tests.” 

What, may I ask, could possibly be more “unfair” than to delude 
those readers who are unfamiliar with the mathematical derivation of r 
into believing that the phrases “mainly positive validity” and “positive 
validity” impart operational significance to the results of personality 
questionnaires? The five verbal categories which Dr. Ellis has coined are 
of the nature of semantic blabs, without operational referents. Arbi- 
trary ranges of correlation coefficients, which determine the limits of 
application of the five verbal classifications, can hardly be regarded as 
operational referents. The simple exchange of a verbal sign for a numeri- 
cal sign is not the discovery of an operational referent for the concept of 
validity. 

* Ellis, Albert. The validity of personality questionnaires. Psychol. Bull., 1946, 43,
385-440.

In his discussion of correlational techniques utilized for the purpose
of validating personality questionnaires, it is obvious that Dr. Ellis
cognizes r as a trigonometric function; otherwise it is doubtful whether
he would have mentioned the need of converting r into E; namely,
$1 - (1 - r^2)^{1/2}$. To strengthen this view, we have his expressed conviction
that “many correlations reported on personality-test validity experiments
should not be shown in terms of r at all.” If one were in a position
to demonstrate that an r of a given size is an unequivocal index of func- 
tional dependence between two arrays of values normally distributed, 
and if, further, one were in a position to predicate empirically by virtue 
of unambiguous knowledge of the causative conditions which operate 
to provide a specified range of effects in response, then one would need 
the assurance of an r of at least .86 in order to bring about the reduction 
of the probable error of predication to one-half the probable error of 
guessing. In spite of this fact, Dr. Ellis selects an r of .80 as the lower
limit of his highest category of validity. Conversion of the five succes- 
sive verbal categories of validity into ranges of E provides the following 
highly questionable series: 

1. Negative validity, E’s from .00 to .01

2. Mainly negative validity, E’s from .02 to .07

3. Questionably positive validity, E’s from .08 to .27

4. Mainly positive validity, E’s from .28 to .38

5. Positive validity, E’s from .40 to 1.00

It can readily be seen from this type of conversion (a conversion
which Dr. Ellis favors but, paradoxically enough, regards as “unfair”)
that the fifth category covers an extension of value greater than that 
of the first four categories. If the above series seems questionable for 
the purpose of establishing categories of validity, then let it be remem- 
bered that the selection of the r ranges from which the E ranges have 
been derived is equally questionable. 
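
The figure of .86 cited earlier can be recovered with a short calculation. The sketch below assumes the customary definitions of the coefficient of alienation and of E; the symbol k does not appear in the text above and is introduced here only for illustration.

```latex
\begin{align*}
k &= \sqrt{1 - r^{2}}, \qquad E = 1 - k = 1 - (1 - r^{2})^{1/2}, \\
k &= \tfrac{1}{2} \;\Longrightarrow\; r = \sqrt{1 - \tfrac{1}{4}} = \tfrac{\sqrt{3}}{2} \approx .866,
\qquad E = .50 .
\end{align*}
```

On this reading, an r of .80 gives k = .60 and E = .40, so that the error of prediction remains three-fifths of the error of guessing.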

Unfortunately in the history of psychometrics, the coefficients r and 
E as indices of numerical communality have been grossly misunderstood
and widely abused. The method of Pearsonian rectilinear correlation, 
which ignores the function of a time-axis and utilizes fixed individual 
deviations in a static matrix of paired numerical values, has been 
falsely identified with the method of concomitant variation. In the 
latter method, a controlled, isolated unit event A is varied by an ob- 
servable, measurable amount to ascertain its determining effect upon 
an observable, macroscopically constant unit event B. Only by a fantas- 
tic stretch of imagination can one attribute cause-and-effect relation- 
ship or a determining, predictive function to an r as such, irrespective of 
the size of the r. By a convenient act of extrapolation, one may choose to 
ascribe a certain level of validity to an r of a given size. Such practice
may be one of expediency or wishful thinking, but it certainly is not one 
of sound, scientific procedure. An adequate proof of a pragmatic kind 
is sorely needed to demonstrate the efficacy of r as an index of validity 
for a specific type of personality questionnaire or test situation. The 
arbitrary verbal classifications introduced by Dr. Ellis do not constitute 
this kind of required proof. At most, they merely add to the extensive 
confusion that exists in the literature on test-validation. 



DISCUSSION OF HEINLEIN’S COMMENT ON “THE VALIDITY 
OF PERSONALITY QUESTIONNAIRES” 

ALBERT ELLIS 
New York City

Heinlein’s comment on my review of The Validity of Personality 
Questionnaires contains several points with which I concur and several 
with which I must take issue. I shall consider it paragraphically. 

In his first paragraph, Heinlein states that I proceed “to evaluate the 
validity of selected research results by means of . . . categories assigned 
to arbitrary segments of the familiar Pearsonian scale of correlation 
coefficients.” There are at least three implications here that may be 
misleading:

1. That I utilized selected research results; 

2. That I evaluated these results only in terms of correlation coefficients; 

3. That the verbal categories I assigned in evaluation of correlation coeffi- 
cients were quite arbitrary. 

Actually, the facts are these: 

1. I reported all research results that could be found in an extensive review 
of the literature. 

2. I evaluated the correlations of only those studies reporting findings in 
terms of r. Many studies reported critical ratios or other clear-cut statistical 
measures which could be accepted without any special evaluations. 

3. The verbal categories I applied to correlation coefficients were not arbi- 
trarily chosen. They have been used as a rule of thumb in the psychological 
literature for many years (e.g., Garrett, 4, p. 342), are endorsed by many 
authorities, and have been widely applied in psychology classes. 

A valid criticism which Heinlein fails to make of my verbal cate- 
gories is my inadvertently misnaming the first two of them. Strictly 
speaking, there is no such thing as “negative validity.” What I call 
“negative validity” in my article should really be termed “negligible 
validity,” and what I call “mainly negative validity,” should be termed
“low validity.” 

In the second paragraph of his comment, Heinlein quotes me to the 
effect that holding personality test evaluation to terms of E rather than 
r would be unfair at the present stage of their development. Unfortunately,
he italicizes unfair when the correct emphasis should be on the
present stage of their development. Throughout his comment he falsely 
emphasizes unfair, henceforth wholly taken from its context, to make
it appear as if I am contradicting myself. For it certainly would be a
contradiction were I to say that using E is an unfair statistical device — 
and then also to say that I favor its use. What my phrasing, correctly 
emphasized, means of course is that the use of E is perfectly fair, 
statistically and strictly speaking, but that because personality
questionnaires are still (I trust) in their infancy or adolescence, taking r’s
obtained in validity experiments and reformulating them in terms of E 
would be unduly rigorous at the present time. Consequently — in order to 
give present-day personality inventories every possible benefit of the 
doubt — I consistently evaluated the reported r’s for validity in a con- 
ventional manner — and, as Heinlein points out, in a generous manner 
indeed. Whether I should be that generous twenty-five or fifty years 
from now is highly dubious. 

Heinlein’s third paragraph is concerned with taking me to task for 
“deluding” my readers into believing that my verbal categories impart 
operational significance to the results of personality questionnaires. I 
naturally tried to do no such thing. First of all, I was not concerned with 
“the results of personality questionnaires,” but with the results of ex- 
periments attempting to validate questionnaires — which is quite a dif- 
ferent thing. Secondly, I was evaluating r’s only because they had been 
reported by investigators, and not because I personally believe that
validity experiments should be analyzed in terms of r. I heartily agree 
with Heinlein that “the simple exchange of a verbal sign for a numerical 
sign is not the discovery of an operational referent for the concept of 
validity.” I must remind him, however, that since r is still being con- 
stantly used as a measure of test validity, and since evaluations of re- 
ported r's are bound to vary among different observers, some convenient 
verbal labels must often be given to them by general reviewers. All I
did in my review was to categorize them with those “semantic blabs”
which seem to be most commonly employed by discriminating
psychological writers and teachers. If this is not scientifically sound, I should
be happy to have Heinlein suggest a better procedure. 

In paragraphs four and five, Heinlein seems to be scandalized by the 
results which would be obtained by converting r’s into E’s. While the
facts of these paragraphs are substantially correct, his manner of relat- 
ing them raises several implications which deviate from the truth: 

1. Heinlein calls the E series “highly questionable,” so that some readers
might infer that there is something statistically wrong with it. This is of course 
not the case. 

2. He emphasizes the fact that “the fifth category covers an extension of
value greater than that of the first four categories.” Naturally it does. As
Hull (7), Bingham (1), Guilford (5), Garrett (4), Conrad and Martin (3),
Peters and Van Voorhis (9) and others have pointed out, this is the
distinguishing feature of using E. Indeed, it is precisely this ultra-conservative
interpretation which conversion into E gives to the lower degrees of r which
recommends its use by test-makers who want to be certain that their
personality questionnaires are adequate for individual diagnosis and prediction.

3. Heinlein again takes my term unfair out of context and (should I say
unfairly?) over-emphasizes it in a manner I never intended.

4. The main implication of these two paragraphs seems to be that, while 
the verbal categories which I use in my article are too generous to those who 
have validated personality questionnaires in terms of r, the categories in terms of
E (which I advocated for use at some future date) are too severe. Heinlein 
does not quite make clear, though, why the classification in terms of E is too 
severe, except to imply that it looks that way. 
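
The behavior of E described in points 1 and 2 can be checked numerically. The following is a minimal sketch, assuming that E takes the customary form 1 - sqrt(1 - r^2); the function names and the truncation to two decimals are illustrative assumptions, chosen because truncation appears to reproduce the series Heinlein quotes.

```python
import math

# r ranges of the five verbal categories, as listed in the original review.
R_RANGES = [
    ("Negative validity", 0.00, 0.19),
    ("Mainly negative validity", 0.20, 0.39),
    ("Questionably positive validity", 0.40, 0.69),
    ("Mainly positive validity", 0.70, 0.79),
    ("Positive validity", 0.80, 1.00),
]


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, assumed to be E = 1 - sqrt(1 - r^2)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    return 1.0 - math.sqrt(1.0 - r * r)


def truncate_to_hundredths(x: float) -> float:
    """Drop digits past the second decimal (tiny epsilon guards float error)."""
    return math.floor(x * 100 + 1e-9) / 100


def e_ranges():
    """Convert each r range into the corresponding truncated E range."""
    return [
        (
            label,
            truncate_to_hundredths(forecasting_efficiency(low)),
            truncate_to_hundredths(forecasting_efficiency(high)),
        )
        for label, low, high in R_RANGES
    ]


if __name__ == "__main__":
    for label, e_low, e_high in e_ranges():
        print(f"{label}: E from {e_low:.2f} to {e_high:.2f}")
```

Under these assumptions the top category spans .60 of the E scale while the first four together reach only .38, which is the conservative treatment of low r's noted in point 2.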

In his final paragraph, Heinlein points out that it is fantastic to 
“attribute cause-and-effect relationship or a determining, predictive
function to an r as such, irrespective of the size of the r.” Here he is
apparently attacking all attempts to validate personality tests in terms of
r. Now I hold no brief for test validation in terms of r. Nor do several 
other recent writers. Jackson and Ferguson (8), for example, advocate 
the use of analysis of variance for the calculation of test reliability and 
validity. Guilford (6) has recommended factor analysis. Brogden (2), 
on the other hand, has recently published a paper in which he sub- 
stantially upholds the value of r as a direct measure of predictive effi- 
ciency, and contends that conversion into E is quite unnecessary. If 
Heinlein has still a different advocacy in this connection, I should be 
interested in hearing it. The main point is that in my article I merely 
noted that a good many experimenters — whether or not I, Heinlein, or 
anyone else likes the fact — did report test validations in terms of r. I was 
concerned, therefore, only with evaluating, or giving a convenient label 
to, these already reported r’s. I wholly agree with Heinlein that “an 
adequate proof of a pragmatic kind is sorely needed to demonstrate the 
efficiency of r as an index of validity for a specific type of personality 
questionnaire or test situation.” My verbal classifications were certainly 
not designed to constitute that kind of required proof. Nor were they in 
the least intended to settle, once and for all, the many ticklish problems 
of measuring questionnaire validity. I am rather surprised that Heinlein
should have ever thought this to be the case.

In sum, Heinlein’s comment on my article on The Validity of Personality
Questionnaires does not really call into question the general
content or conclusions of my study but appears to boil down to (1) a 
questioning of the efficacy of r as an index of test validity, and (2) an
attack upon the specific verbal categories that I attached to r’s actually
obtained by test researchers. On the first of these points, I am inclined
largely to agree with his viewpoint and to welcome its airing. On the
second one, I can only repeat that some verbal classifications of r’s must 
often be made for review purposes; that the ones I employed were 
based on long-term psychological usage; and that if Heinlein has any
better suggestions to make, I shall certainly welcome them. 
Masserman, Jules H. Principles of dynamic psychiatry. Philadelphia:
W. B. Saunders, 1946. Pp. xix+322.

In 1943, J. H. Masserman published, under the title Behavior and 
Neuroses with the University of Chicago Press, an account of a long 
series of experiments which he had been conducting on cats. The ex- 
periments were designed to illustrate and elucidate behavior described 
largely under psychoanalytical categories. Masserman’s new book,
The Principles of Dynamic Psychiatry, extends the range of application 
of these experiments and undertakes a very ambitious program of 
formulation of principles and corollaries. 

Masserman’s system derives from the author’s beginnings as a 
psychoanalyst and from an exposure to the Gestalt tradition plus wide 
reading in the field of general psychological theory, as well as his highly 
original line of experimentation. The Principles of Dynamic Psychiatry 
is divided into two parts of which the first (six chapters) covers the de- 
velopment of behavior theory in academic psychology and psycho- 
analysis. The second part (eight chapters) develops the author’s 
system of biodynamics. He states his conception of the principles of 
behavior and finds illustrations from his experimental and clinical 
records. Masserman finds that there are four general principles. 

1. Principle I reads: “Behavior is actuated by the physiologic needs of the
organism and is directed toward the satisfaction of those needs.” This is of 
course a very explicit adoption of the teleological point of view in its most naive 
and direct form. Masserman has no interest in investigating how a need can 
actuate behavior. He deals with end results and not means. This first teleo- 
logical principle is supplemented by three others. 

2. Principle II (experiential interpretation and adaptation) is: “Behavior 
is contingent upon and adaptive to the organism’s interpretation of its total 
milieu, as based on its capacities and previous experiences.” 

3. Principle III (deviation and substitution) reads: “Behavior patterns 
become deviated and fragmented under stress and, when further frustrated, 
tend toward substitutive satisfactions.” 

4. The fourth principle (conflict) is: “When in a given milieu two or more 
motivations come into conflict in the sense that their accustomed consum- 
matory patterns become incompatible, kinetic tension (anxiety) mounts and 
behavior becomes hesitant, vacillating, erratic, and poorly adaptive (neurotic) 
or excessively substitutive, symbolic, and regressive (psychotic).” (The state- 
ment of this fourth principle is a fair sample of the author’s style.) 

The four principles are each supplemented by numbers of corollaries. 
As may be inferred from the rather sketchy and loose statements of the 
principles given above, the corollaries consist of exceptions, relevant 
general remarks, or of elaborations of the principles. They are distinctly 
not corollaries in the sense in which that word is used by mathematicians
or logicians. 

In an appendix, twenty-one pages are devoted to a detailed account 
of an illustrative psychoanalysis of a neurotic personality. Three other 
appendices include a short article on psychoanalytic formulations of 
the psychoses, an article on illustrative motion picture films prepared by 
the author and his associates, and a reprinting of an article on propa- 
ganda written by Masserman in 1942. Thirty pages are devoted to 
bibliography and forty to an extended glossary of psychiatric terms. 

"Psychoanalytic theory today," Masserman says, "... is far from 
an accepted body of dogma; on the contrary, a great deal of it is fluid, 
ambiguous, unintegrated, and exceedingly polemic" [page 93]. Masser- 
man’s book should serve to reduce this confusion and advance the 
science of behavior. The interest of psychologists in the book will prob- 
ably be directed less toward the principles and corollaries which do not 
bear very close scrutiny than toward the experiments with cats which 
are highly original in conception. Results of the cat experiments are 
interpreted as illustrations of principles and corollaries. 

These experiments in general involved a cage, a means of administer- 
ing punishment in the form of an air blast or shock and neutral stimuli 
and food rewards. They cover the simple acquisition of substitutive 
goals, fixation on part patterns, dominance, frustration, aggression, 
regression to earlier solutions, conflict-generating discriminations, ex- 
perimental neurosis, and the effects on conflict of removal from the 
situation, forced solutions of conflict relief by re-training, inter-animal 
influence on neurosis, spontaneous working through to resolution of 
conflict, and the effects of drugs, particularly alcohol, on the behavior of 
animals in conflict situations. This is a rich experimental offering. 

Edwin R. Guthrie.

University of Washington. 

Rogers, C. R., & Wallen, J. L. Counseling with returned servicemen.
New York: McGraw-Hill, 1946. Pp. vii+159.

Although the words "servicemen" and "veteran" are frequently 
used and the cases illustrated are drawn from among returning service- 
men, this book makes no attempt at a specific analysis of veterans’ 
problems. Instead, it attempts to place such problems in the general 
framework of broad personality maladjustments to which the technique 
of nondirective counseling may be applied. 

As a general introduction to the technique of nondirective counsel- 
ing, the book is excellent. It is simply and clearly written, and the ex- 
position of nondirective methods is orderly and reinforced by a wealth 
of carefully chosen quotations from actual counseling interviews. The 
last chapter contains practice exercises for would-be counselors, and 
there is a selected bibliography of 23 titles in the appendix. Since the
authors apparently envisage the practicing of nondirective counseling 
by readers whose instruction may be limited to the reading of this book 
(“Where skilled supervisory criticism is unobtainable definite profit may 
be obtained by a group of counselors studying together and analyzing 
and criticizing each others’ interviews”), one wishes that the treatment 
of nondirective counseling were not so completely positive and en- 
thusiastic, and that some mention had been made of negative instances 
or of the possible limitations of the nondirective technique in handling 
some types of problem. 

William A. Hunt. 

Northwestern University. 

Johnson, Wendell. People in quandaries. New York: Harper, 1946.
Pp. xiv+532.

In this book Dr. Johnson applies general semantics to the problems 
of personal and social adjustment. His thesis is (p. 45) “... that science
clearly understood, can be used from moment to moment in everyday 
life, and that it provides a sound basis for warmly human and efficient 
living.” Maladjustment, in Dr. Johnson’s opinion, is essentially the 
result of Idealism which leads to Frustration and finally, to Demoraliza- 
tion. This IFD sequence, together with the myriad evaluations, labels, 
and abstractions which stem from it, leads to the conditions out of which 
many serious disorders of behavior develop. 

The author presents a convincing case for his viewpoint, and he 
presents it in a clear and entertaining style rarely found in books on this 
subject. He spices his pages with a large number of examples which are 
both pointed and sprightly. For example, when discussing extensional 
devices, the author quotes a wag who reputedly said that there are two 
kinds of people in the world: those who always divide the people of the 
world into two groups and those who don’t. 

Most semanticists seem prone occasionally to yank Aristotle down 
from his niche in the halls of time and give him a thorough drubbing. 
Also, they like to indulge a similar mass action against the philosophers 
of any period. In this, Dr. Johnson is no exception. However, he devotes
but a few pages to dispatching Aristotle and Aristotelian logic,
and his potshots at philosophers are few and mild. In contrast, he treats 
the members of the medical profession with exaggerated respect, almost 
with awe. He might have presented some excellent examples of how the 
members of some medical groups created problems of a semantic nature 
by hanging labels as psychoneurotic or NP on many thousands of service- 
men. 

A minor fault in the book is found in the author’s occasional tendency 
to cast aside logical restraint when elaborating a point. On page 280 he 
writes, “Einstein has never performed a laboratory experiment; that
may well be a major part of the reason for his tremendous scientific
achievements.”

But such lapses are quite rare. The book represents a significant con- 
tribution in an area which has hitherto been inadequately treated. 
While it is really a book for everyone, it has a special appeal for those 
who deal professionally with problems of adjustment. Clinical psycholo- 
gists will find particular value in the chapter “In Other People’s 
Quandaries” and in the semantic exercises and research discussion at 
the end of the book. As a text or as reference reading, People in Quandaries
should fit in well with psychology courses which deal with behavior
problems.

Irwin August Berg. 

University of Illinois.

Carp, Bernard. A study of the influence of certain personal factors on a
speech judgment. New Rochelle, New York: The Little Print, 1945.
Pp. viii+122.

The field of speech and speech education has expanded rapidly in 
recent years. In the selection of teachers, radio announcers and speakers, 
stage and screen performers, telephone operators, etc., as well as in 
placement work in the schools, the need for improved methods of speech 
testing has become acute. 

This study attempts to determine whether audible speech can be 
tested and rated objectively. It also describes in detail the methods used 
in constructing, administering, scoring and weighting an original battery 
of speech tests called Speech Appraisal Forms. 

Speech tests were administered to 25 male college seniors and graduate
students whose test performances were rated on Speech Appraisal
Forms by 24 specially trained judges. The resulting data were treated 
statistically by means of the Latin Square modification of the “Analysis 
of Variance” technique. The author recognizes the limitations of this 
method and recommends further experimentation and more precise pro- 
cedures. He concludes, however, that such factors as dress, appearance, 
poise, etc., “need not significantly affect a judge’s rating of audible 
speech.” 

This work should be of particular interest to those concerned with 
speech testing of either children or adults. It will be of value, also, to 
those interested in conducting further research in this important field. 

St. Clair A. Switzer.

Miami University.

Morgan, John J. B. How to keep a sound mind. (Rev. ed. of Keeping a
sound mind.) New York: Macmillan, 1946. Pp. vii+404.

The point of view expressed in this book is identical with that of the
first (1934) edition, i.e., that mental health depends upon the development
of correct mental habits. Ten of the 14 chapters are essentially the
same as ones in the older book, although the titles are not always the 
same. The materials have been rearranged and improved, new sections 
have been added and others deleted. The chapters on crime, self- 
confidence, exaggeration of defects, and how to strive toward desirable 
goals have been replaced with chapters dealing with a wholesome pat- 
tern of living, developing social poise and security, and “how to be 
happily maladjusted.” Some of the materials in the omitted chapters
appear at various places in the new book. 

The brief bibliography at the end of the earlier edition has been re- 
moved and a list of references included for each chapter. Considerably 
fewer than half of the titles, however, bear dates later than 1934. The 
number of study questions has been greatly reduced. There are no 
tables, graphs, or illustrations. 

The style is interesting, with many case studies and other valuable 
anecdotal material. There is, however, a lack of documentation and of 
references to experimental studies. Occasionally statements are made 
which are questionable and should be supported by data. For example
(p. 230), “The probabilities that a man will remain in an occupation 
depend (according to statistical studies) much more on the financial and 
domestic responsibilities which he must meet than upon his success in or 
fitness for his job.” No evidence is cited.

The book, offered as a basal textbook for college classes in mental 
hygiene, attempts to “. . . [put] in understandable form the basic
principles involved in the preservation of one’s own mental health.”
This purpose is well achieved, although the book perhaps will be more
useful as supplementary material than as a basic text in mental hygiene,
especially with students who have had previous work in psychology. 

Claude M. Dillinger. 

Illinois State Normal University.

Stern, Edith M. The attendant's guide. New York: The Common- 
wealth Fund, 1945. Pp. xiv+104. 

Recent “revelations” of alleged mistreatment of psychopathic hospital
patients have naturally raised questions about, and brought charges
against, attendants employed in such institutions. Stern’s little book is 
a timely instrument for improving the training of attendants. 

Divided into three parts, it treats respectively of procedures demanded
by patients’ general characteristics, particular kinds of patients,
and of the attendant and his job. The first part has to do with hospital
routine. Part Two classifies patients by dominant symptoms (from the
viewpoint of ward care), for example, patients “who do too much,” are
“not fit to be seen,” “convalescents,” and other similar groups. The
final part is a frank talk about the attendant’s responsibilities and his 
present limited prestige status. The advantages of the occupation are
listed as valuable experience prefatory to more advanced levels of train- 
ing, and being in on the ground floor of a coming skilled craft. 

Throughout the style is direct and non-technical. Chapters are 
replete with specific suggestions for meeting practical problems. Mrs. 
Stern’s book will make a definitely favorable contribution to the goal it 
professes, namely the improvement of attendants’ skill.

Stanley S. Marzolf. 

Illinois State Normal University. 

Huntington, Ellsworth. Mainsprings of civilization. New York:
John Wiley, 1945. Pp. xli + 660.

Professor Huntington’s general thesis concerning weather and 
climate in relation to human affairs is widely known through his many 
publications. This book attempts an integration of much of his work to 
date. 

The mainsprings are: a “basic evolutionary force,” the innate
qualities of the human organism, the physical environment, and the 
social environment or culture. Although discussed more specifically in 
the author’s earlier writings, these basic concepts in the present context 
are not concisely expressed. For example, the “great evolutionary force 
which permeates all nature” is implicitly accepted as a kind of push from 
the rear operating in the history of life. Although told that “for thou- 
sands of years civilization has been advancing along certain definite 
lines,” the reader is left to infer from context that these “definite lines” 
probably have to do with the development of larger and larger single 
units of social organization and with the “betterment” of living
generally. Again, the innate qualities of the organism are referred to in
very general terms, sometimes almost as a primary good, insuring 
human progress. 

The writing is enthusiastic and vigorous and will undoubtedly appeal 
to those who seek to establish correlations among the several disciplines 
addressed to the study of human behavior. The book will be read with 
some skepticism by those who are well acquainted with the “ifs,” “ands” 
and “buts” pertaining to the facts established within any one discipline. 

Professor Huntington disavows the theory of innate race differences; 
he states that all peoples or stocks contain qualities which will enable 
them to adapt and to develop culturally, but that these qualities develop 
or fail to develop depending upon the presence or absence of appropriate 
environmental conditions. An ingenious comparison of Newfoundland 
and Icelandic cultures illustrates this interplay of organic quality and 
environing circumstances. 

Social psychologists will be interested in the treatment of character 
and inheritance, the place of “kiths” (“a group of people relatively
homogeneous in language and culture and freely intermarrying with one 
another”) in history, and the significance of selective migration and 
environment in the development of kiths. Likewise, the consideration of 
weather in relation to mental activity and behavior, diet and national 
character, and cycles in human activity deserve attention. 

Psychologists will find Huntington’s criterion of mental activity 
somewhat limited: “A good measure of intellectual activity on a large 
scale is the circulation of books by libraries, especially ordinary city 
libraries. People read serious books more frequently when their minds 
are active than when they are inert” (p. 344). Armed with library
circulation figures from a number of American cities, the author demon- 
strates a seasonal and a weather linkage with reading and advances a 
theory of heightened mental activity in relation to increased atmos- 
pheric ozone. 

In only a few instances, chiefly citations from literature, are correlation
coefficients presented. At one point (p. 230), the author refers to r²
as giving the “per cent of the resemblance between two variables due to
a common cause”; it might be more accurate to speak of “common 
elements.” This apparent lack of distinction between concomitance and 
causation appears at a number of points throughout the book. Causal 
relationships are freely deduced or inferred from observed associations. 
In practically no instance is the degree of the association established 
in terms of a correlation coefficient, nor are the observed associations 
submitted to a test of significance. 

The author accepts the concepts of multiple causation and of 
multiple correlation, but he experiences great difficulty in applying 
them to his data by verbal, descriptive methods. There is an almost 
complete absence of the concept of reliability of a statistic. When per- 
centages are compared, differences which fit a “trend,” no matter how 
slight, are accepted as “real.” Occasionally, the author will advance 
rational explanations for slight reversals in a trend. 

The chief problem which psychologists will have in evaluating this 
work is that of the quite different framework within which the author 
operates. His interest in human behavior is oriented toward man in the 
mass rather than toward man, the individual. He concentrates on trend 
rather than on variation. Huntington’s procedures probably do reveal 
facts which analysis, by segmenting data, tends to overlook. Roughly 
analogous situations occur in the relation of range of measurement to 
magnitude of correlation, and in the use of the correlation coefficient 
for actuarial prediction as contrasted with prediction of the single case. 

But in spite of the author’s apparent unfamiliarity with psychological 
methods and findings, this study has considerable value for psychologists 
and their work. It reveals a body of material of undoubted 
significance for human life which can greatly complicate the psychologist’s 
effort to account for variance in, and to predict the course of behavior. 
It affords a mass of ideas which could well command the research 
energies of a corps of social psychologists for a lifetime, translating 
the ideas into hypotheses capable of experimental test. 

Dale B. Harris. 

University of Minnesota. 

Smith, Bruce Lannes, Lasswell, Harold D., & Casey, Ralph D. 
Propaganda, communication, and public opinion; a comprehensive reference 
guide. Princeton: Princeton Univ. Press, 1946. Pp. v+435. 

This book represents a continuation of Propaganda and promotional 
activities; an annotated bibliography, also written by the present authors, 
and published in 1935. It lists titles of and briefly comments on nearly 
3,000 selected books, periodicals, and articles, most of which appeared 
between mid-1934 and March, 1943. In addition to this annotated 
bibliography, which fills approximately two-thirds of the book, there 
are certain other features of general interest. 

Four chapters at the beginning cover important background mate- 
rial: channels of communication, political communication specialists, 
description of contents of communications, and description of effects of 
communications. In his section on communication channels, Casey 
briefly reviews the historical development of American communications 
as influenced by the rise of democracy, industrialization, and urbaniza- 
tion. The economic aspects of communication are discussed, with 
special reference to costs of and relationships between radio broadcast- 
ing and newspaper publishing. 

Smith follows with a discussion of the major political communica- 
tion experts of our times. Information is given in tabular form concern- 
ing the heads and propaganda ministers during World War II of some 
of the larger nations. Their occupational origins are noted and data are 
presented concerning their fathers, and their own careers. Further 
tables give such facts about them as estimated incomes of fathers, 
childhood exposure to authoritative symbols of society, formal educa- 
tion, socio-economic status during first decade of employment, and 
ages. 

In the third and fourth introductory essays Lasswell attacks the 
problems of describing the contents and determining the effects of com- 
munications. Analysis of contents can be done by a listing and com- 
parison of themes, by counting of socially optimistic and pessimistic 
expressions in speeches, and noting values expressed in motion picture 
characterizations. It is desirable to base content analysis on insight into 
subject response. Effects may be judged by votes in free elections, ac- 
tions by public officials, by people in direct contact with the movements 
involved, etc. 

The bibliography itself is arranged in sections covering (1) strategy 
and technique, (2) promoting groups, (3) response to be elicited, (4) 
symbols utilized, (5) channels, (6) measurement, and (7) control and 
censorship. Entries are arranged alphabetically by titles within sections. 
One hundred and fifty outstanding titles are starred in the listings and 
commented on relatively more extensively. This sectioning and 
alphabetization is supplemented by a comprehensive combined subject 
and author index. 

As in the case of many books by several authors, the individual sec- 
tions are not too well integrated, although in themselves interesting and 
informative. Annotations do not seem to follow any particularly sys- 
tematic pattern, covering in various combinations the actual contents, 
their significance, their value, and the backgrounds of the authors. This 
approach sometimes leads to weak notes, but on the whole the annota- 
tion is quite adequate for most purposes. The 42-page author and sub- 
ject index should add greatly to the usefulness of the bibliography. 

This book is a reference work of great obvious value, indispensable 
to any adequate library. 

R. B. Ammons. 

University of Denver. 

Dewey, John. Problems of men. New York: Philosophical Library, 1946. Pp. iv+424. 

Except for a prefatory note and an introduction of 18 pages, this 
book consists wholly of a reprinting of articles published during the 
past 12 years in 14 different journals and magazines, together with one 
paper originally published, as stated in the prefatory note, “half a 
century ago” (exact time and place not indicated in the book, but actually 
in Decennial Publications of the University of Chicago, 1903). 

A reviewer in The New York Times (June 9, 1946) has called this “a 
mighty book.” This may be true in the sense that its author is a mighty 
man of influence, through more than half a century, on psychology, 
education, and philosophy. His influence was felt earliest on psychology 
(functionalism) through his 1886 book, Psychology, and through his 
1896 article, “The Reflex Arc Concept in Psychology.” Since the turn 
of the century his influence has been greatest on educational theory and 
practice (“progressive education”) and on philosophy (instrumentalism). 
Whatever book he writes will command the respect of thoughtful 
readers. What E. G. Boring wrote of Dewey in 1929 is equally true 
today, — “While he has had in the last twenty-five years little effect 
upon psychology proper, he has led a very influential life in its effects 
upon intellectual America, expounding repeatedly the problems of 
human nature” (A History of Experimental Psychology, 542). 

Except for the magic of the author’s name, however, this book is not 
great as a book, since it was originally written, not as a book, but as individual 
articles, each complete in itself, for a wide range of journals and 
magazines from The Journal of Philosophy to The Rotarian. Few readers 
except reviewers and devotees of Dewey’s philosophy in its entirety will 
have the persistence to read the whole book. The chapters on education 
will be of interest to the general reader, dealing as they do with a 
vigorous defense of democracy and the experimental method against 
all forms of external authority, whether these are based on classical 
tradition, medieval supernaturalism, or present-day demands for “a 
moratorium on science.” On the other hand, all the chapters of Part III, 
except one, are from The Journal of Philosophy; and they consist largely 
of technical philosophical arguments of interest mostly to other technical 
philosophers. The uncut pages typically found in numbers of that 
journal in the average college library bear witness to this statement. 

One reviewer has spoken of Dewey’s “inimitably cluttered prose” 
(Time, June 26, 1946). While this is a true enough characterization in 
general, there are instances in this book of chapters, like the one on 
James Marsh and the first of the two on William James, so well written 
as to make applicable to Dewey himself what he says of Bertrand Rus- 
sell, — “His lucidity and felicity of expression are ever the despair of 
lesser writers, and in . . . [these chapters] he has almost surpassed him- 
self” (p. 171). 

Wesley R. Wells. 

Syracuse University. 

Rand, W., Sweeny, M. E., & Vincent, E. L. Growth and development of the young child. Philadelphia: W. B. Saunders, 1946. Pp. vii+481. 

The fourth edition of this elementary text of child growth and de- 
velopment appears six years after the third revision. The subject matter 
is almost completely reorganized, but much of the old content remains 
essentially unchanged. The new edition is arranged in topical sections — 
physical, intellectual, social and emotional, etc. — whereas, the earlier 
edition considered all of these topics in sections devoted to different 
chronological periods of development. While the more recent treatment 
provides a more logical and continuous approach to child development, 
it may make the book more cumbersome and inconvenient for parents 
who wish to read as their children develop. 

The sections of this new edition devoted to emotional, intellectual 
and social development have been expanded and a chapter on current 
concepts of growth and development has been added. The chapter on 
current concepts reflects a significant trend in child care — a trend away 
from Watson’s dicta toward the contemporary philosophy of self-regulation 
of diet and development as proposed in recent publications 
by Gesell and Ilg, Ribble, the Aldriches, et al. An example of this shift 
in emphasis is the authors’ discussion of thumbsucking. In the third 
edition they state, “Mechanical restraints are sometimes helpful if used 
before the habit has become so important to the child, or if used with 
older children as a reminder when they themselves have decided to 
conquer the habit.” In the recent revision one finds “The older recom- 
mendations for the use of mechanical restraints, of rewards and punish- 
ments and other adult imposed devices are now considered not only 
poor practice, but in all probability risky to the future well-being of the 
child.” References to the recent publications of Gesell and Ilg, and 
Ribble are copious. 

Although the new edition contains 417 references, as contrasted with 
only 221 in the third edition, one still finds many aspects of psycho- 
logical development ignored, sketchily treated, or inadequately sup- 
ported with available, published research data. In some cases, definite 
rules are laid down for child guidance without any reference to research 
findings. This, in many cases, appears to be an effort to extend a hand of 
aid and comfort to parents who want something definite — correct or in- 
correct. 

Since this revision retains much of the previously published material 
and includes a considerable amount of recent research, it should be ac- 
cepted and appreciated by those who have found the earlier editions 
useful. In the reviewer’s opinion this edition should be a helpful source 
book for a highly selected group of parents and a better-than-average 
textbook for an elementary course in child care and training. 

George G. Thompson. 

Syracuse University. 

Beaumont, Henry. The psychology of personnel. New York: Longmans, Green, 1945. Pp. xiii+306. 

Beaumont, Henry. Psychology applied to personnel. New York: Longmans, Green, 1946. Pp. viii+167. 

The Psychology of Personnel is intended for employers and as a text 
in a general course for college students. The level of the text is such that 
a previous course in psychology would not be necessary as a prerequi- 
site. Psychology Applied to Personnel is a work-book to accompany the 
text. 

The Psychology of Personnel contains the following 11 chapters: I. 
Understanding Employees, II. Analyzing Jobs, III. Selecting Employees, 
IV. Training Employees, V. Working Conditions, VI. The 
Workers’ Health, VII. Promoting Safety, VIII. Supervision, IX. Merit 
Rating, X. Providing Incentives, XI. Occupational Adjustment. Citations 
from 59 industrial and other organizations, which are an important 
contribution of the book, give examples of personnel procedures. 
Thirteen organizations are cited with reference only to Training Employees. 
There is no author index, and in fact no author or almost none 
is cited by name in the entire book. There are no tables, no figures, and 
practically no psychological statistics. This type of presentation of the 
material which omits the names of investigators and statistics will prob- 
ably appeal to the average reader. 

As is somewhat implied by its title, the treatment of The Psychology 
of Personnel is intermediate between books on industrial psychology and 
personnel administration. The absence of statistical results and freedom 
from detailed reports of investigations makes it appear closer to the 
usual book on personnel administration than to the usual book on in- 
dustrial psychology. 

Beaumont has succeeded in writing a general book that is probably 
simple enough not to intimidate anyone. Almost all of the available 
conclusions of psychological applications to personnel are presented. 
On the other hand there is little perspective shown in pointing out major 
emphases. There are an unusually large number of topics included for a 
book of this size. Besides the topics expected from the chapter headings 
there are a good many surprising ones as for example a brief discussion 
of Planned Parenthood (p. 165). The simplicity of the book occasionally 
spills over into elaborating the obvious as “In all plants, offices, and 
stores where women are employed there is a need for separate sanitary 
facilities.” And with reference to evaluating experience, an entire page 
is devoted to pointing out that whether the experience of an auto 
mechanic in Detroit will be indicative of success in Tulsa depends on 
level of performance, pay level, hours of work, location of plant, and job 
environment (p. 60). 

In addition to presenting a great many examples of personnel prac- 
tices in American industry and business, considerable emphasis is 
placed on Army practices and on the problems of servicemen returning 
to civilian employment. 

The two fullest topics are those of training and selecting supervisors 
by tryout. 

Psychology Applied to Personnel is made up of two parts. Part I: 
Personnel Statistics, and Part II: Notes, References, Questions, and 
Applications. 

Part I: Personnel Statistics, gives the method and blank work sheets 
for calculating frequency distribution, measures of central tendency, 
range, standard deviation, group and individual comparisons, sig- 
nificance of traits, and product-moment correlation. A set of actual 
data on 94 cases from a company is given. These data are used in the 
illustrative problems and are to be used in assignments. These assignments 
are interesting. Uncharacteristic but nevertheless unfortunate is 
a slip in explaining correlations (p. 36). The explanation assumes that 
the relative importance of coefficients of correlation is of the same 
magnitude as the size of the coefficients rather than the square of the 
coefficients. This error is not cleared up under The significance of correlation, 
when referring to a coefficient of −.09, “. . . this relationship 
is not a close one so that there will be many exceptions to this general 
rule.” Wonderful understatement! 
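
The point about squared coefficients can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the comparison coefficients of .30 and .60 are assumptions chosen for the example, and only the coefficient of −.09 comes from the workbook passage quoted above.

```python
def shared_variance(r):
    """Return r squared, the proportion of variance two measures share."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


# -.09 is the coefficient quoted from the workbook; .30 and .60 are
# hypothetical values added only for comparison.
for r in (-0.09, 0.30, 0.60):
    print(f"r = {r:+.2f}  ->  shared variance = {shared_variance(r):.2%}")

# Doubling the coefficient from .30 to .60 quadruples the shared variance,
# which is why coefficients cannot be compared by their raw size.
ratio = shared_variance(0.60) / shared_variance(0.30)
```

On this reckoning a coefficient of −.09 corresponds to less than one per cent of shared variance, which is the force of the remark about understatement.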

Part II of Psychology Applied to Personnel has chapter headings 
identical with those of The Psychology of Personnel. Twenty-five good 
true-false questions are presented for each of the 11 chapters of the text. 

A large number of up-to-date references are given with respect to 
each major topic of each chapter. These references are from a variety 
of sources, industrial magazines, personnel journals, advertising or semi- 
advertising publications by companies, government publications, and 
the psychological literature. The references should be particularly use- 
ful to teachers. For the average employer or other student there are too 
many references and insufficient annotation to be of greatest usefulness. 
A deficiency with respect to a number of references to government 
sources is that the references are not satisfactorily definite, as for ex- 
ample, “The Women’s Bureau of the United States Department of 
Labor has available several publications dealing with proper provisions 
for the employment of women in industry” (76). 

All in all these two books by Beaumont are a definite contribution in 
presenting personnel psychology at the intended elementary level. 

Thomas W. Harrell. 

University of Illinois. 

Maier, Norman R. F. Psychology in industry. New York: Houghton 
Mifflin, 1946. Pp. xvi+463. 

Reports during the last decade describing studies of the effect of the 
social milieu in industry are reflected in this new Psychology in In- 
dustry. Although most of the topics covered are those found in the con- 
ventional industrial psychology text, the emphasis and amount of 
space allotted to each topic is considerably different. While this book is 
not a manual of advanced techniques, the discussion is by no means 
shallow. 

About one-quarter of the book considers general principles and 
causes of behavior, and the development of attitudes and morale. The 
problems of the worker on the job and his relationships to it, his fellow 
workers, and to management are discussed in detail. Maier feels that 
one must understand human behavior as it exists rather than to blame 
and punish. Man's behavior is a product of multiple forces and these 
must be changed before behavior changes. “Causation implies that a 
given individual in a given situation must do as he does.” Management 
should seek causes rather than treat symptoms. Reference to the 
Western Electric studies shows the effect of social groupings, attitudes, 
and morale upon performance. Since the background for the development 
of attitudes in the cases of management and labor is different, and 
since there is no “right,” mutual understanding of the other's point of 
view is necessary. 

The section on individual differences, ratings, and the use of tests is 
elementary and general. Where other texts become expansive and 
specific, Maier deliberately curtails. Practically no specific tests are 
mentioned although the reader is referred to other sources. This section 
of the book is the weakest and most poorly organized. Even a non- 
technical review should emphasize the extreme importance of criteria. 
The consideration of job analysis (and its relationship to selection of 
adequate criteria) should precede material on selective and evaluative 
instruments. 

Fatigue, time-motion analysis, accidents, and the working environ- 
ment (illumination, atmosphere, noise) receive adequate attention. The 
author comments on the techniques employed by the industrial engi- 
neer, but his emphasis is on changes in the behavior of workers brought 
about by improved conditions. Training and motivation get standard 
treatment. Perhaps the topic of motivation should have been included 
in the earlier chapters concerned with causation in behavior. Labor 
turnover is interpreted as a symptom of dissatisfaction and manage- 
ment is advised to ferret out its causes. The final chapter reviews 
briefly, but perhaps too repetitiously, the material covered in the pre- 
ceding chapters with special emphasis upon its use by the supervisor, 
the industrial counselor, and higher management. 

A number of errors or questionable statements can be found, but 
probably not more than tend to creep into any text. For example, there 
is an implication that any test that will correlate .30 with a criterion is 
good, and while one grants that predictions will be better than chance, 
the implications of such a low validity coefficient should have been 
clarified. In fact, the author once implies that any test is better than no 
test. In another place one finds the statement, “Intelligence tests . . . 
are aptitude tests in the sense that education and experience have little 
or no effect on the score." One statement says that jobs requiring 
dexterity and mechanical operations show no relationship with intel- 
ligence, but the following sentence indicates that lower levels of in- 
telligence are sometimes more satisfactory on these jobs. In reference 
to time-motion economy, the view that, “The right and left halves of a 
man's body are mirror images of each other," disregards the functional 
inequality of opposite sides of the body. 

The physical appearance of the book is good but appropriate il- 
lustrations might make for higher interest, particularly in view of its 
intended audience. The few references made are found at the bottoms 
of pages which allows easy reading. A bibliography by chapters is found 
in the appendix. 

If used as a text, supplementary readings would be desirable. Certainly 
space allotments do not actually reflect the amount of experimental 
information available in each area. However, the change in 
emphasis toward the social and individual aspects of work and the 
variety of applications suggested make refreshing and interesting read- 
ing for the student of industrial relations. 

Lester P. Guest. 

The Pennsylvania State College. 

Harriman, P. L. (Ed.) Twentieth century psychology. New York: Philosophical Library, 1946. Pp. xiii+712. 

This book does not fulfill the expectations implied by its attractive 
title. The reader who expects a survey of contemporary psychology will 
be disappointed. It is a collection of thirty-nine articles by an assort- 
ment of writers, most of them well-known to psychologists, and some of 
them heavyweight authorities. But twenty-five of the articles are re- 
prints from current periodicals, the majority easily accessible to profes- 
sional psychologists. Periodicals represented more than once include the 
Journal of Social Psychology and the Journal of Abnormal and Social 
Psychology, with four articles each, the Journal of General Psychology 
with three, and the Psychological Review with two and the larger part of 
another. Thus, many of the contributions will not be new to the profes- 
sion. 

According to the editor, the volume was prepared for the general 
reader, but it is uncertain what general reader he had in mind. The 
psychologically untrained reader will find the technical jargon too much 
for him. The student coming up in the field will find much better 
volumes for parallel reading in his course work, until he is ready to go 
directly to the periodical literature. About the only general reader I can 
think of for whom the volume might have been designed is the specialist 
in related fields of knowledge, e.g., philosophy, the social and natural 
sciences, etc. Such a reader is apt to get as much impression of what is 
wrong with psychology as of what psychologists are contributing of use 
and value to him. 

One thing which would appear to be wrong is that psychologists 
seem to write with a heaviness of touch that is well-nigh depressing. Let 
me cite only two of many possible examples. “How may differences in 
subgroup structure, group stratification, and potency of ego-centered 
and group-centered goals be utilized as criteria for predicting the social 
resultants of different group atmospheres?” ask Lewin, et al. (200). 
“We may say that the disturbance of the tentatively established move- 
ment-stereotypy at this critical point disordered the manner in which 
the focal stimulus-patterns were encountered just before blocking oc- 
curred in the given blind, thereby interfering with expansion of the 
nuclear blind-conditioning process in the segment,” says Schnierla 
(291-292). The general reader who wishes to know more about what is 
being discussed in passages like these may refer to previous publications 
of the writers. He may, in short, begin specializing in psychology and, 
in fact, in their brand of psychology. 

The book may also be censured for a carelessness of production that 
is hardly excusable on the ground of wartime difficulties. There are 
many typographical errors, occasional misplaced footnotes, and one 
complete failure of correspondence between illustrative figure and text 
(421-424). The text refers in capitals to A, B, C, D, E, F, G and H, and 
to shaded areas HEG and GFH. The figure, if such it be, though it isn't 
so labeled, has the lower case letters a, b, c, d, e, f and g scattered over 
the page, and no shaded area. Finally, there is a two-page index of one 
column to the page with a total of 70 entries for the more than 700 
pages of material. 

A third weakness is an emphasis, unbecoming of a field struggling to 
assume a proper position in the natural sciences, on the armchair 
method. One may have no objection to the use of intuition and insight 
whenever possible, but if Klein, who writes in defense of this method, 
wishes to see some prime examples of what it would contribute to the 
advancement of knowledge, he has only to read the closely following 
articles by Maslow and by Brunswick. The first refers to some per- 
centages of strength of motivation (40) that are plainly imaginative, 
and the second is abracadabra. 

Lest this review suggest that there is nothing worthy of publication 
(or republication) in the book, I might mention that I enjoyed many of 
the original articles, including most of the review of his research by 
Schnierla, the article on conditioning by Harris, Brandt’s survey of his 
ocular photography (although a little more modesty might be in order 
in the descriptions of his apparatus), and, among the reprints, Har- 
rower-Erickson’s outline of Rorschach methods. Also, the article con- 
tributed by the editor is presented simply enough to indicate that he, at 
least, knew what general reader he had in mind. But these seem to be a 
meagre return for the heavy expenditure of time necessary to go through 
the whole volume. 

Geo. M. Peterson. 

University of New Mexico. 

Since the beginning of the century there has been a volume of work 
studying spontaneous activity of animals, especially the rat and mon- 
key. The best review article is that of Shirley (97) in the Psychological 
Bulletin in 1929. Since that time the subject has not been reviewed 
comprehensively. Somewhat limited reviews appear in Munn (67), 
Morgan (66), and Gray (31). Other and even more circumscribed re- 
views include Richter (73, 74) reporting work done under his direction; 
Hoskins (40), describing some of the relationships between endocrines 
and activity; and Kreezer’s (52) summary of methods for measuring 
activity in the rat. A review of diurnal rhythms by Welsh (119) in 
1938 discusses much material not directly related to activity. Mettler 
(64) has reviewed and summarized studies on the effects of striatal in- 
jury in 1942. 

This article does not attempt to cover work done prior to Shirley’s 
review in this Journal. Several of the most important references to 
work done prior to 1929 are included, but the emphasis has been almost 
entirely on later material. 

The Concept and Measurement of Spontaneous Activity 

There are two points the reader should keep in mind as he proceeds 
through the paper. The first is a methodological issue. Much of the 
research to be reviewed depends on a general concept of spontaneous 
activity without regard to how the activity is measured. It will become 
evident in the course of the review that our concept of activity must be 
tied to the measure of it which we have used, for the results one gets 
with one measure of activity may be entirely reversed when a different 
measure is used. The second closely related point is a matter of terminology. 
Since the largest amount of work has used animals running 
inside a drum, it will simplify things if the term activity, without any 
qualification, always refers to running activity, not to other measures 
of activity. Measures other than those in an activity-drum will always 
be clearly distinguished. 


Method and Apparatus 

Running Drums. Animals and human beings indulge in spontaneous 
activity. This observation has been quantified in many ways. The 
animal most frequently used in experimentation is the rat, whose activ- 
ity is usually measured in what has been called an activity cage, but 
will henceforth be referred to as a drum or running drum. This device 
was first used by Stewart (110) and has been most adequately described 
by Slonaker (101, 102). It usually consists of two 10-13 inch circular 
boards mounted on a shaft and separated by a sheet of mesh wound 
around their periphery (86, 94). The rat runs inside the freely rotating 
drum, and a counter is attached to record the number of revolutions. 
Unfortunately, the usual system of measurement has shown only total 
activity, not activity as a function of time. 

Recognizing this inadequacy, Skinner (100) used a Harvard work 
adder in conjunction with a kymograph to get a summative record 
whose slope is a constant measure of activity. 
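
The idea of a summative record can be sketched in a few lines. The example below is not Skinner's procedure, only an illustration of the principle that the slope of a cumulative curve gives the rate of activity over any chosen stretch of time; the hourly counts and the function names are hypothetical.

```python
from itertools import accumulate


def cumulative_record(counts):
    """Running total of revolutions at the end of each interval."""
    return list(accumulate(counts))


def slope(record, start, end):
    """Mean revolutions per interval between two points on the record."""
    if end <= start:
        raise ValueError("end must come after start")
    return (record[end] - record[start]) / (end - start)


# Hypothetical revolutions counted in eight successive hours.
revs_per_hour = [0, 5, 40, 120, 110, 30, 0, 0]
record = cumulative_record(revs_per_hour)

active_rate = slope(record, 2, 4)  # steep stretch of the curve
quiet_rate = slope(record, 5, 7)   # flat stretch of the curve
```

A counter read once a day yields only the last value of such a record, whereas the full curve shows when the running occurred.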

The drum has almost as many variations as there have been experi- 
menters in activity. Stewart's 20-inch diameter drum and the 26-inch 
diameter drum used by Park and Woods (71) represent one extreme, 
while Shirley (94) used a 10-inch diameter. Results reported in terms 
of number of revolutions are obviously not comparable when the diam- 
eter of the drums is not the same. Furthermore, equating the running 
by expressing it in distance traversed is of questionable validity in view 
of Farris' statement that rats in larger wheels run farther than those in 
smaller wheels (24). 
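
The lack of comparability is easy to see when revolutions are converted to distance. The sketch below assumes only that one revolution carries the animal one circumference of the drum; the figure of 1,000 revolutions and the function name are hypothetical, while the three diameters are those mentioned above. In view of Farris' observation, even the converted distances should not be treated as equivalent measures of activity.

```python
import math


def distance_feet(revolutions, diameter_inches):
    """Distance run in feet, taking one revolution as one circumference."""
    return revolutions * math.pi * diameter_inches / 12.0


# Drum diameters (inches) cited in the text; 1,000 revolutions is hypothetical.
for diameter in (10, 20, 26):
    feet = distance_feet(1000, diameter)
    print(f"{diameter}-inch drum: 1000 revolutions = {feet:,.0f} feet")
```

The same count of revolutions thus stands for 2.6 times as much ground covered in the 26-inch drum as in the 10-inch drum.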

Depending upon the experiment, the rat may live entirely within the 
drum (94, 95, 101), have a separate living cage, or use supplemental 
diffuse activity cages (71). Since Richter (73) has shown that the num- 
ber of revolutions of the drum is reduced when the rat has a choice of 
several things to do, the results of different experimenters may not be 
comparable. 

The revolving drum has been the most extensively used laboratory 
instrument in investigating activity. Its physical variables have been 
discussed by Skinner (100) and Lacey (53). The reliability of the measures 
obtained is remarkable: Shirley (94) reports a rank-order correlation 
of .97 for five-day totals of activity, and a split-half r of .90. 
Beach's (4) figures are even higher (.98). Unfortunately a basic assumption 
in these results involves the equivalence of the measures. Lacey 
(53) raises the very justifiable criticism that the measure may be showing 
only the consistency of the different drums. It is significant to note 
that in one case in which the animals were changed from one cage to 
another, the correlation reported was .80 (113). There are wide indi- 
vidual differences even between litter mates in normal rats with respect 
to running, some rats running 200 revolutions per day and others 
20,000. The pattern of running is set up by the tenth day or not at all. 
After this time the individual differences are relatively constant. 
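
For readers unfamiliar with the rank-order coefficient, a minimal Python sketch follows. The revolution totals are hypothetical and are not taken from any of the studies cited; the formula is the usual one for untied ranks, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), and the function names are assumptions for the example.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_order_correlation(x, y):
    """Rank-order coefficient for two untied series of equal length."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical five-day revolution totals for six rats in two periods.
first_period = [200, 1500, 4000, 9000, 12000, 20000]
second_period = [350, 1800, 1200, 8000, 15000, 11000]
print(round(rank_order_correlation(first_period, second_period), 3))  # 0.886
```

A high coefficient of this kind shows only that the animals keep their order from one period to the next. When each animal stays in its own drum, the sketch cannot separate consistency of the animal from consistency of the drum, which is the point of Lacey's criticism.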

The running drum has been used to indicate tension or motivation 
in the rat. Thus, Durrant (20) and Slonaker (106) have correlated run- 
ning with sex drives. Geier and Tolman (28, 29) have used running 
behavior to indicate increase in tension in the rat. 

Dorcus (16) devised a cage which moved slowly toward a goal object 
when the rat ran inside of it. 

Tambour- or Spring-Mounted Cages. Another apparatus for measuring
activity is that first used by Szymanski (114, 115), which consisted
of a spring-mounted cage attached to a lever recording system. The 
disadvantage of lack of damping has been somewhat overcome by 
tambour-mounted activity cages (73, 109). The three supporting tam- 
bours are joined to one tube and record every movement on a kymo- 
graph. Both these methods produce records according to time, but 
records which are difficult to treat quantitatively because no ready 
means of determining the total activity is available. 

Hunt and Schlosberg (42, 43) counted the number of 5-minute active
periods occurring in such a cage over varying intervals of time, and 
Irwin (44) recorded the number of active seconds per minute in new- 
born children. Wilbur (121) used a spring mounted cage connected to 
a Harvard work adder to obtain a summative record (of the activity
of chicks) which is much easier to interpret. 
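
Both counting schemes can be sketched in Python. This is an illustration under stated assumptions, not the procedure of the authors cited: a period is called "active" here if it contains at least one recorded movement, a criterion the sources above do not specify, and the movement times are hypothetical.

```python
def count_active_periods(movement_times_s, total_duration_s, period_s=300):
    """Number of periods (default 5 minutes) containing any movement."""
    n_periods = total_duration_s // period_s
    active = set()
    for t in movement_times_s:
        index = int(t // period_s)
        if 0 <= index < n_periods:
            active.add(index)
    return len(active)


def active_seconds_per_minute(movement_times_s, total_duration_s):
    """For each minute, the number of distinct seconds with movement."""
    n_minutes = total_duration_s // 60
    seconds = [set() for _ in range(n_minutes)]
    for t in movement_times_s:
        minute = int(t // 60)
        if 0 <= minute < n_minutes:
            seconds[minute].add(int(t) % 60)
    return [len(s) for s in seconds]


# Hypothetical movement times, in seconds from the start of the record.
half_hour = [12, 15.5, 290, 610, 615, 1500]
print(count_active_periods(half_hour, total_duration_s=1800))  # 3

three_minutes = [12, 12.4, 15.5, 70]
print(active_seconds_per_minute(three_minutes, total_duration_s=180))  # [2, 1, 0]
```

Either count reduces a time record to a single number per interval, at the cost of ignoring how vigorous each movement was.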

Smith (109), measuring audiogenic and electrogenic convulsive ac- 
tivity, supported a cage from four pneumographs or by one large flexible 
hydron bellows. Oscillation could be reduced by means of a small vent. 

Other animals have been used with appropriately modified cages to
record activity. Monkeys have been fastened by a nine-inch chain to a 
2.5-inch rod, so that movement caused the rod to advance a counter 
(84). A monkey-sized pneumatically-mounted activity cage has been 
used by Kennard, Massimy and Chevallier (51, 62). 

Other Automatic Methods of Recording Activity. Another measure of 
activity has been suggested, incorporating a tilting box (7) in which the 
movement of the rat from one end of the box to the other advanced a 
counter. Claiming that the tilting motion would interfere with accurate
measurement of the rat's activity, Siegel (98) utilized the animal's motion
from one end of a 22 × 6 inch box to the other to break a photoelectric
relay and thus advance a counter. 

A horizontal turntable for exercising rats has also been used to record 
activity (21, 22). Since the distance the rat runs depends upon his 
proximity to the center of the turntable, it is probable that this device 
will not be popular in controlled experiments. 

Curtis (15), working under Liddell, reports the use of a pedometer 
to record activity of the sheep and the pig. Head-shaking in chickens has 
also been reported by Levy (58). 

Observational Method of Quantifying Activity. An observational 
means of recording activity has been used by Hall (32), Beach (3), and 
Fredericson (25). Hall recorded the distance traversed by rats in a
round open field eight feet in diameter. Beach noted which of 36 squares 
a rat entered upon in the ten minutes it was free in an area three feet 
square. Fredericson observed six classes of behavior indulged in by rats 
in a field two feet square. 

Spontaneous Activity as a Behavior Category 

Now let us take a moment to see what the methods just reviewed 
have to do with the concept of activity. Most of the authors tend to 
lump all manifestations of activity together and to pin one label on all 
of them — activity. This failure to distinguish types of activity in terms 
of its measure leads to a false concept of activity, for what data we have 
point to more than one type, or at least more than one aspect of activity. 

There are, for example, wide individual differences in the running
activity of rats but not nearly such wide differences in restless cage
activity. Tainter (116) found that caffeine, metrazol, and picrotoxin had
no effect on running but did increase behavior measured in a diffuse 
activity cage. Hunt and Schlosberg (43) found only 9% decrease in 
diffuse activity with castration instead of the 98% found by Hoskins 
(40) for running activity. 

In light of these considerable differences, it seems logical that differ- 
ent terms should be used to distinguish the two devices and the be- 
haviors which they measure. Throughout this paper, the author has 
attempted to distinguish between running activity in the rotating drum,
on the one hand, and diffuse activity in a cage or stabilimeter on the
other. As long as the terminology for these two distinct situations is the
same, the notion will tend to persist that they are strictly comparable
measures, which they are not.

Heredity and Age

Genetic Basis. Rundquist (89) by selective breeding has been able
to get active and inactive strains of rats. The active strain is less easy 
to purify than the inactive strain. Selection for running produced strains 
in which there were measurable: (1) increases in number of successful 
matings, (2) increases in sizes of litters, and (3) decreases in the gesta- 
tion period. The active rats also had a higher basal metabolic rate than 
the inactive strain. The selective breeding of these strains has been 
carried through 29 generations, with no change beyond the 12th (9, 
30). Brody has concluded that the two strains differ with respect to a 
single gene which acts as a dominant in males and a recessive in fe- 
males. This gene must act as an inhibitor, since none of the matings 
within the inactive strain produces active offspring, but on the other 
hand, active-strain matings produce individuals which vary from ex- 
treme inactivity to extreme activity. The genetic factors are somewhat 
obscured by environmental influences. 

Age. Running activity of rats increases with age until the animals
are about 80 days old, then is relatively constant until about 120 days, 
after which it gradually falls off till death (72, 95, 101). Richter (73) 
determined the amount of running, diffuse activity, and nest building as 
a function of age. The more active rats tend to have shorter lives than 
the inactive rats. 


The Internal Environment 
Nutrition 

The daily running activity of the rat increases just prior to the nor- 
mal time of feeding, even though the rat has been fed in what would 
otherwise be an inactive period (72, 103). This fact is probably expli- 
cable on the basis of hunger or metabolic changes connected with the 
daily 24-hour hunger rhythm initiated by constant feeding at a specific 
hour (120). Studies made by Richter (73) on generalized activity, how- 
ever, show that the rat very probably has a two-hour hunger rhythm if 
he is allowed to have food constantly available. 

If a rat is deprived of food for a period of time, its activity will tend 
to increase for as much as 96 hours. If deprived of food and water, it 
increases for 72 hours before it drops off (73, 117), probably due to 
weakness. Brobeck (8) showed that if food intake and environmental 
temperature are held constant, there is a negative correlation between
activity and weight gain. Smith and Conger (107) varied the diet of
rats by keeping the caloric value constant but changing the proportion
of fat or protein. They found that up to 56% of the caloric value of food
may come from fat without reduction in spontaneous activity. Fifty 
per cent of animal protein, however, induces marked reduction in 
running. Following ingestion of protein there is a marked metabolic 
rise which is not present following ingestion of fats. This result extends 
Slonaker’s (105) finding that a diet of 14 to 18% protein yields maximal 
spontaneous running. Hitchcock (39), however, fed rats eight grams of 
meat daily and observed little effect on running. 

A protein-free diet for short periods of time is effective in increasing 
activity. But a low-protein diet over a long period of time depressed 
activity (39). 

When allowed to select the amount of alcohol ingested, rats maintained
their activity, although the forcing of alcohol by inclusion in the
drinking water supply caused a reduction in activity (35). 

Vitamins 

Rats deprived of food, water, food and water, thiamin, and ribo- 
flavin showed increased activity until they were given ample amounts 
of the formerly scarce substance; then they were quiescent to a greater 
degree than normal until they had again ‘caught up’ (117). Limitation 
of Vitamin B is promptly followed by increased running activity in the 
rat; after five to ten days, however, activity is sharply decreased, long 
before any clinical signs of deficiency appear. This is accompanied by a 
drop in quantity of food ingested. When these rats are then given
unlimited amounts of food and vitamins, activity decreases and remains
low throughout the period of increased intake (5). These two papers do 
not completely corroborate Jackway’s earlier finding (47) that slight 
deprivation of Vitamin B complex, not severe enough to cause any 
symptoms except retardation of growth and slight roughness of coat, 
results in a decrease in voluntary activity. 

In a very suggestive paper Ziegler and Knudsen (124) showed that
young white rats fed on a rachitic diet during infancy ran less after 
recovery from the rickets than normals. If, however, the rats were 
deprived of Vitamin D at an even earlier age by feeding the rachitic 
diet to their mothers during pregnancy, the surviving pups were more 
active after recovery than normals. 

That deficiency is not always accompanied by increased activity 
is demonstrated by the fact that rats deprived of the magnesium ion 
show a consistent drop in activity without the initial rise usually ac- 
companying deprivation (117). Smith and Smith (108) observed that 
rats fed a diet low in inorganic constituents gradually declined in
voluntary running. The reduction in running occurred long before the
appearance of impaired running ability. 

Drugs 

There has been interest in the effects of drugs on behavior as showing 
perhaps quantifiable action similar to that qualitatively produced in 
human beings. In general, analeptics stimulate activity (6), but have a 
depressant or diminishing effect with prolonged use. Thus cocaine, 
benzedrine, ephedrine, and propadrine increased running for single 
doses, but effects of tolerance showed when the specific drug was ad- 
ministered for several days (116, 123). Caffeine, metrazol, and picro- 
toxin depressed running activity, but enhanced restless activity as 
measured in diffuse activity cages (116). In a subsequent paper, Shulte 
and Tainter (91) have shown the differential temporal effects of the 
administration of caffeine, coramine, and metrazol on diffuse activity. 

Running activity is increased by benzedrine sulphate and by Kola, 
but there is a decrement in activity just before feeding time 24 hours 
later (35, 92). 

Phenobarbital administered intraperitoneally (69) to white rats 
depressed the nighttime running level, but had no effect on the daytime 
level which is normally much lower than that of the nighttime. These 
injected rats ate less, drank less, and gained less weight. Even after 
withdrawal of phenobarbital, the experimental rats were less active 
than the controls (49). 

Other drugs used have been dinitrophenol and thiouracil. These are 
discussed below. 


Endocrines 

Adrenals. In a review of experiments dealing with endocrines up to 
1927, Hoskins (40) reports that entire removal of the adrenals leads to
marked reduction in running. This operation is very difficult to perform 
successfully, but in all cases in which there was little reduction in activ- 
ity, subsequent autopsy showed hypertrophied adrenal particles. In 
12 out of 27 adrenalectomized rats, Richter (79) was able to restore
running partially by implants of adrenal gland to the ovaries.
Subsequent removal of the grafts decreased activity again. Histological
examination showed visible cortical tissue but no medulla. Increase of
salt in the diet produced a definite but partial increase in activity. 
Ergographic studies show that while the absolute strength of muscles 
remains unaltered, the capacity for work following adrenalectomy is 
reduced to 8% of normal. Bilateral abdominal sympathectomy and
adrenal inactivation by section of the nervous connection reduced running
activity slightly for a period of 10 days with no apparent effect
after that time (1). Lacey (54) injected adrenalin into rats and found a 
decrease in total diffuse activity although restlessness as indicated by 
behavior in the normally inactive period increased markedly. 

Thyroid. Hoskins (40) has reported no reduction in running resulting 
from thyroidectomy. This result has been questioned by Hall and Lind- 
say (34) who found 50% reduction using the same measure as that re- 
ported by Hoskins. Richter (75) has pointed out that the results ob- 
tained depend upon the amount of thyroid tissue remaining, for a very 
small piece is enough to sustain activity. Thyroidectomized rats may 
be re-activated by replacement therapy. However, if the quantity of 
thyroid administered is too great, the activity is once more reduced. 
That reduction in metabolic rate as such is not responsible for decrease 
in activity was shown by increasing the metabolism of thyroidectomized 
and normal rats by dinitrophenol (33). Inhibition of the thyroid by 
thiouracil resulted in slower growth, lowered metabolism, increase in 
the length of interoestral periods, reduction of spontaneous activity and
disruption of rhythmic patterns of activity (61). Rundquist's breed of 
active rats had a higher metabolic rate than his inactive strain (90). 

Pituitary. Ablation of the pituitary leads to marked decrease in 
running activity. Injection of emulsion of fresh anterior lobe of the 
hypophysis reduced the decrease in activity, but only in one animal out 
of seven did this replacement therapy cause activity to remain at the 
original level. Viable hypophyseal transplants to the anterior chamber 
of the eye had only a slight stimulating effect (81). 

When the stalk of the pituitary is severed in the female rat, the 
normal oestral periodicity is greatly slowed down. Five female rats
showed oestral periods separated by the following numbers of days:
9.6, 11.2, 13.0, 14.3, 18.4. Richter explains this as being probably a 
whole multiple of the fundamental four-to-five-day ovulation cycle. 
This explanation is corroborated by the fact that ovariectomy elimi- 
nated all cycles (77). 

The genetically hypopituitary animal, the dwarf rat, possesses a di- 
urnal rhythm (70). 

Liver. Normal function of the liver is important in maintaining 
activity. Ligation of the bile duct caused almost complete reduction of
activity in animals in which the duct remained occluded. Associated with
this effect was widespread destruction of liver cells, considerable
dilation of the bile duct, and other profound peritoneal effects (80).
Partial hepatectomy was performed by Dugal and Ross (17), who found that
the 25% of the liver remaining took the same length of time to regenerate
as did activity to return to normal. 

Pancreas. The decrease in activity following pancreatectomy is
variable. Some rats show little loss, others great. This reduction is
attributed to the inability to metabolize carbohydrates (85).

Ovaries. There are several measurable sex differences in daily running
activity. The female goes through a cycle of activity of about
four to five days’ duration, the peak of which is reached at oestrus 
(23, 118). Hourly records of the female show variation in the distribu- 
tion of the activity as a function of the time in the oestrus cycle (103). 
This cycle may be abolished by bilateral ovariectomy (77, 87), when the 
activity in the drum is decreased by as much as 98%, but it is restored 
by ovarian transplants (87). The presence of the uterus is not essential 
to this activity cycle, as is shown by the fact that hysterectomy has no 
permanent effect on running activity (18, 19). If copulation takes place 
during oestrus, the activity is reduced until pregnancy and lactation are 
over and the female is again in oestrus as shown by cornified epithelial 
cells or receptivity. Stimulation of the cervix during receptivity leads 
to pseudo-pregnancy with about a 10-day marked reduction in running, 
followed by gradually increasing activity until the previous level is 
reached in about six days (104). 

The cycles of the ovariectomized rat may be restored in females or 
impressed in castrated males by several means, all of which involve 
supplanting the hormonal deficiency. Pregnancy urine of women seven 
to nine months pregnant, given in drinking water, was found to restore 
activity to near normal levels (78). Amniotin injected in spayed fe- 
males and castrated males had a similar effect (82). The earlier
discrepancy between results of other investigators, reported by Shirley, is
possibly explicable by the findings of Young and Fish (122). They were 
able to restore activity to levels prevailing before gonadectomy by es- 
trone administered in such a way that a constant source of the hormone 
was maintained. A certain small amount of estrone is necessary for the 
manifestation of running activity, but beyond that quantity, further 
estrone does not increase the activity nor alter its periodicity. These are
similar to Heller’s results (36) on male hormone. Castrated males show 
oestrus cyclic activity from ovarian transplants. Elderly males increase 
activity and lose weight following ingestion of an estrogenic fraction of 
chorionic gonadotropic extract (41). 

Testes. The running activity of the male is generally much less than
that of the female, and lacks its cyclical characteristics. Complete
castration leads to marked reduction (about 98%) in running. Fractional
castration by Gans (27) of the following amounts of the testes of rats: 0,
1, 1½, 1¾, 2, led to the following decrements in running compared with
his controls: 20%, 11%, 26%, 48%, 79%. Probably there was not time 
for regeneration to take place. Castration before the 12th day after 
birth leads to no activity reduction in the adult rat (26) but this 
statement has been questioned by Richter (76) who found the reduction 
in running almost as great as in rats castrated after sexual maturity. 
Richter ascribes this difference to the relative inactivity of Gans’
controls. It is possible, however, that the findings of Hunt and Schlos- 
berg (42, 43) would bear on the question. They used general diffuse ac- 
tivity cages and found little difference between the number of five- 
minute active periods of normal and castrated males. The running 
activity of both male and female castrated rats is increased by testicular
grafts, provided these grafts are “functional takes” (87).

The relation between sexual drive and running activity is unclear. 
A male rat in a running drum near one female rat will show peaks of 
activity correlated with her oestral periods. If several females are in 
nearby drums, the total running of the male is increased but is no longer 
of the simple oestral periodicity (20, 106). On the other hand, the cor- 
relation between sex drive (as measured by copulatory tests and the 
Columbia obstruction box) and running activity is near zero (113). 
Rundquist (89) found that rats bred for activity were more fertile than 
those bred for inactivity. These facts are not contradictory but do 
require clarification. 

External Environment 
Light and Darkness 

The rat displays more running and more diffuse activity in darkness 
than in light. This rhythm is of a 24-hour periodicity and is remarkably 
stable. The daily activity periods are determined by the internal
rhythm of the animal, which rhythm is only gradually changed by the 
external conditions — light and temperature (10, 13, 14, 43, 48). Using 
a rhythm involving alternation of eight hours of dark and eight hours 
of light, Hemmingsen and Krarup were unable to abolish the 24-hour
rhythm even in rats kept under these conditions from birth (37). By
using six hours of light and warmth, alternating with six hours of cooler
darkness, Browman (14) was able to impress a 12-hour rhythm on about 
30 out of 32 rats. The rhythm was not stable, however, as shown by 
the fact that the longest period during which it persisted in any one rat 
was 37 days, in spite of the continual 12-hour cycle of light and heat. 

The female rat shows a four-to-five-day oestrous cycle of activity
which can be somewhat altered by lighting in that constant light leads
to longer intervals between peaks of oestrous activity and lowered daily 
activity (37). Browman (10) even found evidence of constant cornifi- 
cation (but not receptivity) in some of his animals. Subsequently he 
demonstrated that the intact visual apparatus was necessary for this 
shift to take place (11, 12). Females who were blinded showed the same 
24-hour rhythm as those reared in constant darkness. Feeding schedules 
apparently had little effect on the running time. That the intensity of 
the constant light is important has been nicely demonstrated by John- 
son (48) who found that the amount of change in the period of maximal 
activity of mice was related to the quantity of cage illumination. These 
results were obtained with mice in diffuse activity cages, since the mouse 
apparently exhibits even more dichotomous activity cycles than the 
rat. Working with male rats in diffuse activity cages, Hunt and Schlosberg
(43) attempted to determine the source of the 24-hour activity
rhythm; since castration did not abolish the periodicity they came to 
the conclusion that it was not based on the testes. Reversing the normal 
day and night periods caused the rats to shift their period of maximal 
activity in about a week. 

In constant light, the female rat will show peaks of activity which 
are correlated with a daily cool period (13). This is also true of blinded 
rats. The pre-oestrus activity readings come at the end of the 12-hour 
cool period. It had previously been shown that in a hot room, rats will 
reduce their activity slightly (13, 97). Below about 40 degrees Fahren- 
heit the rat's activity is reduced also. 

Temperature 

The effect of temperature on the diffuse activity of two-day-old 
mice was determined by Stier (111). These young mice are still
poikilothermic so that control of external temperature had direct effect
on the body temperature. It was found that different straight-line
Arrhenius plots would fit the several measures used: amount of activity,
quiescence, and frequence of occurrence. Thus, he concluded that activity
is a function of several factors.
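
What a straight-line Arrhenius plot implies can be sketched in Python. The Arrhenius relation is rate = A exp(-mu / (R T)), so the logarithm of the rate plotted against the reciprocal of the absolute temperature gives a straight line of slope -mu / R. The data below are synthetic, generated from an assumed value of mu, and are not Stier's measurements; the function names and the constant A are likewise assumptions for the example.

```python
import math

GAS_CONSTANT = 1.987  # calories per mole per degree Kelvin


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def temperature_characteristic(temps_celsius, rates):
    """Estimate mu from the slope of log(rate) against 1/T."""
    inverse_t = [1.0 / (t + 273.15) for t in temps_celsius]
    log_rate = [math.log(r) for r in rates]
    slope, _ = fit_line(inverse_t, log_rate)
    return -slope * GAS_CONSTANT


# Synthetic rates generated from an assumed mu of 16,000 calories.
assumed_mu = 16000.0
temps = [20, 25, 30, 35]
rates = [1e12 * math.exp(-assumed_mu / (GAS_CONSTANT * (t + 273.15)))
         for t in temps]
estimated_mu = temperature_characteristic(temps, rates)
print(round(estimated_mu))
```

If several measures of activity each fall on a straight line but yield different values of mu, the lines have different slopes, which is the kind of result that would suggest more than one underlying factor.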

Activity Following Confinement 

The drive for activity hypothesized in several behavior theories has 
not been fully substantiated. Shirley (95) reports increased running
activity after a rat has been confined one to two days, but after ten days 
of enforced inactivity, running decreases. Siegel (99) found an increase in
the number of interruptions of a light beam following close confinement.
It is important to note that in neither case is the amount of activity
under conditions of no confinement significantly different from the 
increase reported after confinement. 

Skinner (100) has stated that “if any extensive activity is prohibited
during part of a day, the remaining part shows a greater ‘density’ of
activity per unit of time.” Since he measured activity for a five
to six hour period beginning at 3:00 a.m., it is possible that his conclu- 
sion is an artifact based upon the normal peak of the rat’s running ac- 
tivity which occurs at about 2:00 a.m. After this time there is a gradual 
decrease in the amount run, a rhythm which will persist in constant 
darkness. Without further work substantiating the generality of the 
principle of greater activity following inactivity, it would be
parsimonious to ascribe Skinner’s results to the specific hours at which
he recorded activity.


Miscellaneous Studies 

Other studies using the running drum which might be mentioned 
are: the correlation between activity and errors in learning a maze is low
(89, 96), while that between activity and time to traverse mazes is
higher (68); rats given difficult discriminations run less than rats that
are not made abnormal by difficult problems (60); a series of
electroconvulsive shocks greatly reduced voluntary activity in rats (112).

Neural Control of Spontaneous Activity 

The search for a neural center which controls activity has led to 
contradictory evidence. There is complete agreement that frontal 
lesions do increase activity and that bilateral lesions are more effective
than unilateral lesions. The hyperactivity usually takes two to three
weeks to emerge. The animals on which most of the work has been done 
are rats, cats, and monkeys. 

The first observation of the effect of cortical lesions on activity has 
been attributed to several clinicians and experimenters. In 1920 Lashley 
(57) quantified the change in activity of the rat by using a running 
drum and noted that only frontoparietal lesions both increased the 
number of hours of running and decreased the time spent in resting.
Jacobsen (45, 46) noted increased restlessness and general activity fol- 
lowing frontal destruction in the monkey, and Langworthy and Kolb 
(56) described the behavior of cats with heightened restlessness. Since 
1937, considerable evidence has been gathered concerning the effect on 
spontaneous activity produced by lesions in the brain.

In the rat, the unilateral removal of the frontal pole did not appear
to augment running activity. The inactive rats became hyperactive 
while the active rats did not increase their running relatively so much. 
Bilateral ablation of the frontal poles was much more effective in
increasing running than the unilateral (3, 83). Beach (4) measured
running for 30 days before and 50 days after electrolytic destruction of
varying amounts of the corpus striatum. Activity increased postop- 
eratively in one animal, decreased in two, and was unchanged in two 
others. “No relationship between the magnitude of lesion and effects on 
activity could be determined. In the rat, the striatum evidently does not 
exert a controlling effect upon running activity as measured in this 
experiment.” On the other hand, Richter and Hines (84) found that
monkeys with unilateral striatal lesions immediately had greatly in- 
creased activity, and Mettler (64) reports hyperactivity in cats follow- 
ing striatal lesions. 

According to Mettler (63, 64) when the striatum is injured, hyper- 
kinesia is the rule. He asserts that the striatum is an inhibitory mecha- 
nism; “ . . . stimulation of it produces inhibition and removal of it 
engenders evidence of motor release. It stands on the one hand between 
the cortex and the final common path as part of the route through which 
the cortex may exert an inhibitory effect and, on the other hand, it 
operates between the thalamus and lower motor mechanisms in the 
automatic inhibition incident to ‘unconscious activity’.” If the cerebral
cortex is totally removed, the decorticated animal does not exhibit in- 
cessant activity but shows an inability to initiate or inhibit movement 
suddenly (65). 

Cats

Cats with bilateral one-stage removal of the rostral portions of the 
cerebral hemispheres were noted by Magoun and Ranson (59) to be 
almost continually walking about. Langworthy and Richter (55) re- 
corded increase in activity from 27 to 61 units from unilateral operation 
and to 399 units for the bilateral removal of motor cortex, premotor 
cortex, and possibly a small tip of the corpus striatum in cats. 

Monkeys 

In monkeys, Richter and Hines (84) found that bilateral removal of 
areas 8, 10, 11, and 12 had little effect on activity, while that of 9 did 
(62). On the other hand, Kennard and Ectors (50) reported increased
activity following removal of area 8 alone. These results are not
contradictory in view of the method of measurement of activity. Richter
and Hines (84) attached the monkey by a chain to a short steel rod pro- 
jecting from an axle. Movement of the monkey caused the rod to ad- 
vance a counter. The method of recording activity used by Kennard 
et al. was a pneumatically mounted diffuse activity cage. 

Generally the activity builds up in the course of two to three weeks 
following an operation. Ruch and Shenkin (88), however, report that 
lesions in area 13 (of Walker) consistently produce hyperactivity within 
the second post-operative day. Richter and Hines (84) also report such 
immediate hyperkinesia when the monkey has striatal lesions. 

Kennard et al. (51) have stressed the visual role in hyperactivity in
monkeys. “Hyperactivity is markedly affected by visual stimuli. It
disappears in the dark or when the animals have been deprived of vision
either by enucleation of the eyes or by bilateral lobectomy. Absence of
auditory stimuli has not the same effect.”

A decrease in activity was noted by Harris (2) following “bilateral 
one-stage removal of the rostral portions of the neo-cortex of cats.” 
Kennard et al. reported that “hypermotility in monkeys and chimpanzees
is related to lesions of the rostral portions of area 6 and to area 8”
(51).


Summary 

The literature of the last twenty years concerning activity in ani- 
mals has been reviewed. The methods, results and concepts of activity 
have been summarized and appraised. 

Several methods have been in use: 

1. The running drum: this apparatus yields high reliability for measures 
taken on a particular drum, but there are also inconsistencies from one drum 
to another and from one experimenter’s design of drum to another. 

2. The diffuse activity cage: a cage mounted on tambours or springs. The 
record which it gives varies widely from one cage to another. 

3. Several miscellaneous mechanical and observational methods, which have 
not been extensively used. 

There are considerable individual differences in running activity as 
well as variability of one animal’s activity from time to time. The indi- 
vidual differences are due in part to heredity, but are somewhat com- 
plicated by environmental influences. Intra-animal variability can be 
ascribed to several factors: running increases during hunger and most 
deprivations, during darkness and cool periods, and during oestrus. In 
various kinds of endocrine imbalance or deficiency, there is usually a 
decrease in activity — a marked decrease in the running drum but only 
a small decrement in the diffuse activity cage.

Running activity and diffuse activity are sometimes affected in the
same way, sometimes differentially. Both reach a maximum during the 
cool or dark part of a 24-hour cycle. Some of the analeptic drugs stimu- 
late both kinds of activity, but other drugs may increase diffuse activity 
while decreasing running activity. 

Injury to the brain affects activity. In particular, lesions of the 
frontal cortex heighten activity, and bilateral lesions cause a greater 
increase than unilateral injury. Still, it is not yet clear whether there is 
a specific activity center in the brain and, if so, where it is. 

There is now a very large body of data concerning animal activity, 
but it needs further definition and interpretation. Particularly needed 
is a clarification of the concept of activity in relation to the method of 
measuring it. Most treatments of the subject tend to regard activity as 
a single entity. Yet, in some cases, where comparable measures of ac- 
tivity are available from different devices, running drum and diffuse 
activity cage, the results are not the same. Activity, it would then ap- 
pear, does not constitute a single behavior category which can be meas- 
ured with any instrument but must be considered, for the present at 
least, in terms of its method of measurement.


SAMPLING IN THE REVISION OF THE
STANFORD-BINET SCALE

ELI S. MARKS 
National Office of Vital Statistics 

In another paper (4) the writer attempts to point out the biases 
which may arise through types of sampling procedure quite common in 
psychological research. The present analysis is devoted to another effect 
of sampling methods commonly used in psychology, namely, the substantial
increase in sampling error which results when “cluster” methods
of sampling are used. It should be noted that this is not a criticism of the 
cluster type of sampling. Cluster sampling is an extremely valuable 
device and makes feasible many studies which would otherwise be com- 
pletely impossible. However, the use of cluster techniques implies sub- 
stantial modifications in our formulae for sampling error and psycholo- 
gists are, in general, not familiar with these modifications. Unfortunately, 
much of the important work in the field appears in sources which are 
relatively inaccessible to psychologists. Ignoring the effects of cluster 
sampling on measures of sampling error has undoubtedly resulted in 
attaching importance to results which are statistically insignificant. In 
the testing field, failure to allow for cluster sampling has probably 
caused us to attach a measure of precision to our results considerably 
in excess of that warranted by sound statistical techniques. 

Cluster sampling almost always involves an increase in sampling 
error as compared with unrestricted random sampling of the same num- 
ber of cases. It is, of course, possible to obtain a lower sampling error 
with cluster sampling than with unrestricted random sampling if we 
make up our clusters for this purpose. However, the main reason for the 
use of cluster sampling is to permit the sampling of previously existing 
groups (the clusters) and, in most cases, the use of a previously existing 
grouping of the population involves a positive intraclass correlation of 
the variable studied, i.e., our existing groups are almost always more
homogeneous internally than groups of the same size made up by ran- 
dom selection of individuals from the population. It is the existence of 
positive intraclass correlation which cuts down the amount of independ- 
ent information available from a cluster sample of a specified size and 
occasions the substantial increase in sampling error usually associated 
with this sampling method. The present analysis is designed to emphasize
the substantial increase in sampling error which results from relatively
small intraclass correlations. While this phenomenon is quite
familiar to sampling statisticians, psychologists are rather generally
unaware of the marked disturbances of sampling error calculations and
tests of significance introduced by the use of cluster sampling when a
positive intraclass correlation exists.
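
The size of this effect can be illustrated with the standard approximation for clusters of equal size, in which the variance of a cluster-sample mean exceeds that of an unrestricted random sample of the same size by the factor 1 + (n - 1)ρ, where n is the cluster size and ρ the intraclass correlation. The following is a minimal sketch under that equal-size assumption; it is not the formula applied to the Stanford-Binet data in this paper, and the function name and the example values are hypothetical.

```python
import math


def se_inflation(cluster_size: int, rho: float) -> float:
    """Ratio of the cluster-sample S.E. to the unrestricted-random-sample S.E.

    Assumes equal-sized clusters and a large number of clusters, so that
    Var(cluster) / Var(random) = 1 + (cluster_size - 1) * rho.
    """
    return math.sqrt(1.0 + (cluster_size - 1) * rho)


# Hypothetical illustration: clusters of 150 cases each.
for rho in (0.00, 0.01, 0.05, 0.10):
    factor = se_inflation(150, rho)
    print(f"rho = {rho:.2f}: standard error multiplied by {factor:.2f}")
```

Under these assumed values, an intraclass correlation as small as .05 nearly triples the standard error.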

Although methods resembling cluster sampling are quite common in 
psychological research, very few psychological studies have used sam- 
pling designs which permit us to determine the standard error of the mean 
or of other sample statistics. As a matter of fact, it is difficult to find a 
study where analysis of the sampling error formulae used is not com- 
plicated by the presence of a non-measurable design (one in which the 
sampling probabilities are unknown). Some of the difficulties in the use 
of non-measurable designs are explored in a paper by McNemar (2) 
which discusses accidental sampling and purposive sampling as well as 
such measurable designs as unrestricted random sampling and stratified 
sampling. 


The Sampling Plan of the Stanford-Binet 

The writer has, therefore, not attempted to find a study with a 
measurable design, but has selected for analysis the sample used in the 
revision of the Stanford-Binet. This sample has been selected for analy- 
sis principally because the widespread use of the revised Stanford-Binet 
makes the problems involved in its standardization extremely important 
in spite of the lapse of a decade since the revision was completed. The 
revision of the Stanford-Binet is also a good example for our purposes 
because (a) it was an extensive project, involving a relatively large 
number of subjects and the expenditure of considerable amounts of 
time, effort and money and (b) the purposes of the sample were ex- 
plicitly formulated and clearly stated by the authors of the revised 
Stanford-Binet. 

The reader should bear in mind that the present analysis is on an
“as if” basis. The Stanford-Binet sampling design does not yield statistics
with measurable standard errors and no amount of statistical manipulation
can overcome this defect. The cure lies not in statistical
formulae but in more careful sampling techniques in future investiga- 
tions. However, the use of measurable sampling designs in psychological 
research will almost inevitably mean cluster sampling of some sort since 
any other approach will be beyond the limited resources usually avail- 
able to psychologists. Thus, an examination of cluster sampling, even 
on an “as if” basis, is extremely pertinent to the future of any psychological
research which involves statistical techniques.

In my analysis I have relied entirely upon statements and data 
published in Terman and Merrill (6) and McNemar (3). Since the data
required for this analysis have not been published in full detail, I have
been forced to use approximations at several points. Inquiry indicated 
that more detailed data could not be furnished without considerable 
expenditure of time and effort. Since the approximations used in this 
paper are satisfactory for purposes of illustration and since the sampling 
techniques used in the revision of the Stanford-Binet preclude a completely
accurate determination of error even if the detailed data were
available, this deficiency is not serious. In nearly every case, the effect 
of the approximation used has been to understate the sampling error. 

In revising the Stanford-Binet, the major objective was to construct 
scales “so standardized for difficulty as to yield mean I.Q.’s of approximately
100 at all age levels” (Terman, in 3, p. 3). The authors of the
revision realized that their success in this objective was dependent upon 
securing a measure of the distribution of test scores in the general 
population (or in a satisfactory sample of the population). The sample 
was restricted to “American born” subjects of the “white race” in the
age range from 1½ years to 18 years. Terman notes that “elaborate
precautions were taken to make the sampling as representative of the 
entire population as circumstances permitted” (3, p. 6). 

According to Terman and Merrill this was done by selecting “17
different communities in 11 states” (6, p. 12). They note that: “The
selection of localities for the second year’s testing was based upon certain 
considerations in regard to sampling which had resulted from a study of 
the socio-economic level of the first 1500 subjects.” These considera- 
tions were what the authors viewed as an inadequate representation of 
the rural group and a difference between the occupational distribution 
of fathers of the cases tested and the occupational distribution of all 
employed U. S. males. In the second year’s testing, therefore, the authors
of the Stanford-Binet revision “took care to include several additional
rural communities” (6, p. 14). Neither McNemar nor Terman and
Merrill give further details on the method of selecting the communities.
It seems evident that selection was not on the basis of random sampling
(neither simple random sampling nor random sampling within strata). 
As a matter of fact the term “community” is not defined clearly enough 
to permit a rigorous statement of the primary sampling units used. 
Nevertheless, we can visualize our population as being composed of 
“communities” (undefined but definable), so that the entire population 
of the United States can be broken up into a fairly large number (prob- 
ably over 3000) of communities. 

Within each community different procedures were followed for cases 
in the three age groups: 1½ to 5½ inclusive; 6 to 14 inclusive; and 15 to
18 inclusive. These groups were sampled as follows:

1. The group aged 6 to 14. Schools of “average social status” were selected
in each community (method of selecting schools not further specified) and 
within each school all of the children between the ages 6 to 14 who were within 
one month of a birthday were taken, regardless of grade placement (6, p. 15). 
This sampling procedure is, then, a subsampling of subclusters with a 100 per- 
cent sample take within the subcluster. 

2. The group aged 15 to 18. Subjects were selected so that “the advanced
group would be as nearly as possible continuous with the intermediate, with no
break between fourteen and fifteen years. The compulsory school age was
taken into account, the general character of the population, and the type of
secondary education that was offered. In each community the school census 
was consulted to determine the amount of elimination after age fourteen. We 
made certain that some of the twelve-, thirteen-, and fourteen-year-olds who 
had gone to high school were included, also some of the slow fifteen- and sixteen- 
year-olds who were still in intermediate school. A few cases who had graduated 
from high school were included and a few who had dropped out of school with- 
out completing high school. These out-of-school groups were sampled by choos- 
ing siblings of school children in numbers proportional to the amount of elim- 
ination at ages above fourteen.” (Sampling in this group was actually a rough 
type of “quota” sampling.) 

3. The group aged 1½ to 5½. This group was sampled in much the same manner
as the out-of-school cases in the group aged 15 to 18. The authors “chose
as far as possible younger sibs of the school groups.” Children were secured by 
use of birth records, school census, school siblings, kindergartens, well baby 
clinics, day nurseries, nursery schools and “personal report.” Use of the various 
sources differed from community to community. “Great care was exercised 
in the large population centers to include representative groups; if a school in a 
suburban district which had been chosen as average on the advice of superin- 
tendent and counselors seemed to include too large percentage of higher occu- 
pational groups it was offset by a tenement district center.” The authors state 
that “in the smaller communities, from seventy-five to eighty percent of the 
pre-school child population of appropriate age was examined” (6). In the published
tabulations results for children aged 1½ were omitted and further references
deal only with the sample of children two years of age or over.

It may be noted that the population sampled is limited to individ- 
uals within one month of a birthday (or half-year birthday for children 
under six). The population is also limited to American-born white 
persons and, in the age range six to 14, to children attending school. 
These limitations do not affect the propriety of generalization from a 
sample to the population so defined. The limitations may affect genera- 
lization from the sample to all native-born white persons aged two to 
18. This is not, however, of primary concern in this paper. Limitations 
on generalization resulting from the use of sub-populations are discussed 
by the writer in another article (4). For our present purposes, it is sufficient
to accept the population as defined.

To summarize, the sampling plan of the Stanford-Binet revision
involved: (a) sampling of “communities” from the aggregate of all
United States communities; (b) the subsampling of schools for children
aged six to 14 and taking all children (in the population as defined
above) in the selected schools; (c) the subsampling of other members of
the defined population from the “community” without any intermediate 
subsampling of schools but with the use of a rough type of “quota” 
sampling. 

Biases and Variance in the Stanford-Binet Sampling 

The above is only an approximate statement since it is extremely 
hard to formulate exactly the sampling plan used. The method of selec- 
tion at each stage of sampling has not been specified above. It seems 
likely, however, that the sampling error of the plan used is greater than 
the error which would be involved in random sampling of “communi- 
ties” with equal probability of selection and no subsampling (i.e., a plan 
which would take all persons in the communities selected). 

It is obvious that a sampling plan not involving subsampling will 
have a lower sampling error for the same number of clusters sampled 
than a plan which did involve subsampling. The assumption that the 
community sampling actually used involved a larger sampling error 
than random selection is not as clear cut. Actually the sampling re- 
sembled “purposive sampling” or “quota sampling” but it does not 
appear to conform even to the rather loose requirements of these two 
techniques. 

In discussing purposive sampling Neyman (5) developed certain 
hypotheses which, if satisfied, would make the estimate commonly used 
in this method the “best linear estimate” (i.e., an unbiased linear esti- 
mate with variance less than that of any other linear estimate). Neyman 
notes that: 

If these hypotheses are not satisfied, which I think is a rather general case,
we are not able to appreciate the accuracy of the results obtained. Thus this is 
not what I should call a representative method. Of course it may sometimes 
give perfect results, but these will be due rather to the uncontrollable intuition 
of the investigator and good luck than to the method itself. 

While the Stanford-Binet revision did not involve purposive sampling,
Neyman’s remarks are applicable to the sampling plan. Furthermore,
there is internal evidence in the results of the Stanford-Binet revision
which indicates that, in spite of the purposive attempt to secure a
“representative” sample, the Stanford-Binet revision sample actually
produced a larger sampling error than would have resulted from random
sampling of clusters.

Table 3 below gives the number of cases from each of the communities
included in the Stanford-Binet sample. It should be noted that 37
percent of the “urban” cases were drawn from San Francisco and 56 
percent of the “suburban” cases came from two California communi- 
ties. This means that 975 cases or 34 percent of the total sample were 
from California. In addition a disproportionately large number of the 
“rural” cases (41 percent) came from one community in Vermont. It 
would, of course, be possible to obtain clusterings in two states as 
marked as those shown by a random sampling of communities, but the 
probability of such an outcome is extremely small. It is almost certain 
that a random (or stratified random) sampling of communities would 
have given a better geographic distribution (and undoubtedly a lower 
sampling error) than was actually obtained. This fact is also pointed out 
by McNemar (3) who expresses “skepticism concerning the represen- 
tativeness of these communities.” 
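
The percentages quoted in this paragraph can be checked directly from the community totals in Table 3. The short sketch below does so; the figures are transcribed from that table, while the variable names and the helper function are illustrative only.

```python
# Community totals (all ages, 2-18) transcribed from Table 3.
urban = {"Denver": 111, "Minneapolis": 183, "New York": 48, "Reno": 112,
         "Richmond": 187, "San Antonio": 254, "San Francisco": 527}
suburban = {"White Plains": 160, "Redwood City": 134, "Los Gatos": 314,
            "Johnson County": 199}
rural = {"Bullitt County": 43, "Oldham County": 42, "Clark County": 51,
         "Harrison County": 51, "Floyd County": 50, "Bloomington": 92,
         "Randolph": 275, "Atlee": 65}


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


total_cases = sum(urban.values()) + sum(suburban.values()) + sum(rural.values())
suburban_california = suburban["Redwood City"] + suburban["Los Gatos"]
california_cases = urban["San Francisco"] + suburban_california

san_francisco_share = percent(urban["San Francisco"], sum(urban.values()))
suburban_california_share = percent(suburban_california, sum(suburban.values()))
california_share = percent(california_cases, total_cases)
vermont_share = percent(rural["Randolph"], sum(rural.values()))

print(f"San Francisco share of urban cases:  {san_francisco_share:.0f} percent")
print(f"California share of suburban cases:  {suburban_california_share:.0f} percent")
print(f"California cases: {california_cases} ({california_share:.0f} percent of total)")
print(f"Vermont share of rural cases:        {vermont_share:.0f} percent")
```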

It should also be remembered that the authors of the Stanford- 
Binet revision felt very definitely that their results contained a substan- 
tial bias. As noted above, the primary objective of the revision was to 
obtain a scale giving average I.Q.’s of 100 for each chronological age 
group. Terman and Merrill (6, p. 23) note that the mean I.Q.’s for
their age groups run “slightly above 100” and state that this “is the 
result of intentional adjustment to allow for the somewhat inadequate 
sampling of subjects in the lower occupational classes.” McNemar 
(3, p. 20) states: 

The fact that the means in Tables 1 and 2 are above 100 should not lead the 
reader to the erroneous conclusion that the average I.Q. for the population now 
exceeds 100. The excess here observed is in the proper direction to allow for 
known bias in our age samplings. When an adjustment is made for bias in oc- 
cupational status, the age means approach nearer 100, and a further adjustment 
for inadequate rural representation would tend to bring the values still closer 
to 100. 

Table 6 on p. 36 of Terman and Merrill (6) gives average I.Q.’s for each 
age group “adjusted for 1930 Census frequencies of Occupational 
groupings.” These averages still show substantial bias, all means except
those for ages 4 and 5½ being over 100 and seven age groups having
average I.Q.’s over 103. The effect of rural-urban biasing influences is not
presented. 

Since the method of correcting for bias is not stated, the effect of 
these corrections on the mean square errors of the sample results cannot
be determined. It is probably not possible to make this determination
in any event since the presence or absence of biases in occupational or
rural-urban distributions cannot by themselves tell us whether an
I.Q. distribution is biased or unbiased and correcting for rural-urban or 
occupational biases may have very little effect (or even an unfavorable 
effect) upon I.Q. biases. 

In any event, the original sample means of the Stanford-Binet re- 
vision contain substantial biases if the true population means are 100. 
These are shown by the figures in Table 1.* 


TABLE 1

Average I.Q.’s by Age Groups for the Stanford-Binet Revision Sample

                                  Age Groups
                       2-5½       6-13      14-18     All Cases

Form L, Mean         106.58     103.22     103.03      104.00
Form M, Mean         106.42     103.96     103.32      104.43
Number of Cases         728       1623        619        2970


In view of the probable biases and the considerations with regard 
to the sampling method presented above, it is not at all unreasonable to 
assume that the Stanford-Binet revision sampling involved a larger 
standard error of the mean than would random selection of communi- 
ties with equal probability. Even if this is not the case, the subsampling 
involved should account for an increase in sampling error over a design 
in which there was no subsampling. 

On the basis of the above discussion, the standard error of random 
selection of communities with equal probability and no subsampling 
gives us minimum values for the standard errors of the Stanford-Binet 
sample means. To estimate these errors, we shall assume that the num- 
ber of cases actually sampled in each community was the total eligible 
population in that community. (As noted above, assuming that the 
community population was larger than the number sampled would lead 
to a larger estimate of the standard error.)


* The data are from McNemar (3) Tables 1 and 2. There are minor differences between
the results presented by McNemar and those presented by Terman and Merrill
(apparently due to inclusion of some subjects in some of the distributions and their omis- 
sion in other distributions). The differences are minor and do not affect the present 
analysis. 

The data in this and in the two subsequent tables are reproduced with the permission
of Houghton Mifflin Co., the publishers of McNemar’s The Revision of the
Stanford-Binet Scale.


The Standard Error for Cluster Sampling

The standard error for the type of sampling described (i.e., “cluster
sampling”) is given by:

$$\sigma_{x'}^2 = \frac{M - m}{(M - 1)\,m} \cdot \frac{\sum_{i=1}^{M} N_i^2 (\bar{x}_i - \bar{x})^2}{M \bar{N}^2} \qquad [1]$$

Or, when we estimate from the sample, the estimated standard error
is given by:

$$s_{x'}^2 = \frac{M - m}{M m} \cdot \frac{\sum_{i=1}^{m} N_i^2 (\bar{x}_i - x')^2}{(m - 1)(N')^2} \qquad [2]$$

where $M$ = the total number of clusters (communities) in the population

$m$ = the number of communities sampled

$N_i$ = the number of individuals (eligible for the population) in
the $i$-th cluster

$\bar{x}_i$ = the mean I.Q. for the $N_i$ individuals in the $i$-th cluster

$\bar{x} = \sum_{i=1}^{M} N_i \bar{x}_i \big/ \sum_{i=1}^{M} N_i$ = the mean I.Q. of the population. (The aim of the
sample is to estimate $\bar{x}$.)

$\bar{N} = \sum_{i=1}^{M} N_i \big/ M$ = the average number of individuals per cluster in the
population.

$x' = \sum_{i=1}^{m} N_i \bar{x}_i \big/ \sum_{i=1}^{m} N_i$ = the mean I.Q. of the sample. (We are using this as
our estimate of $\bar{x}$ and $s_{x'}$ is the estimated standard
error of this sample mean.)

$N' = \sum_{i=1}^{m} N_i \big/ m$ = the average number of individuals per cluster in the
sample.

To determine $s_{x'}$ exactly we need to know $M$, the number of communities
(clusters) in the population. While $M$ is not known with any
precision, we can be quite certain that it is large and that it is much
larger than $m$ (at least 100 times as great). Consequently, we can, without
appreciable error, take $(M - m)/M$ equal to 1. With this substitution
the square of the standard error is approximately equal to:

$$s_{x'}^2 \approx \frac{\sum_{i=1}^{m} N_i^2 (\bar{x}_i - x')^2}{m(m - 1)(N')^2} = \frac{m}{m - 1} \cdot \frac{\sum_{i=1}^{m} N_i^2 (\bar{x}_i - x')^2}{\left(\sum_{i=1}^{m} N_i\right)^2} \qquad [3]$$

All the data required for Equation [3] can be obtained from the 
sample. Unfortunately, not all of the sample data are available in pub- 
lished form. Since we shall have to rely on published data, some further 
approximations (described below) are necessary. The approximations 
also act to reduce our estimate of the standard error. 
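
How little is lost by setting $(M - m)/M$ equal to 1 can be seen from a short calculation. As an illustration only, take $m = 17$ (Terman and Merrill’s stated number of communities) and assume $M = 3000$, the rough figure suggested above for the number of communities in the United States:

$$\frac{M - m}{M} = \frac{3000 - 17}{3000} \approx 0.994, \qquad \sqrt{\frac{M - m}{M}} \approx 0.997$$

Under these assumed values, dropping the factor changes the estimated standard error by only about 0.3 percent.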

McNemar (3) gives, as Table 9, information on the average I.Q.’s for 
children in “urban,” “suburban” and “rural” communities by age
groups. This table, plus data for the entire group in the age range 2 to
18, is presented in Table 2. The data for the entire group were calculated
from the information given for the three age groups. 

TABLE 2

I.Q. Data for Urban, Suburban and Rural Children*

                        Urban    Suburban     Rural     Total

2-5½ Year-Olds
  Number                  354         158       144       656
  Mean                  106.3       105.0     100.6     104.7
  S.D.                   15.7        16.1      15.4      15.9

6-14 Year-Olds
  Number                  864         537       422      1823
  Mean                  105.8       104.5      95.4     103.0
  S.D.                   14.7        16.8      15.5      16.1

15-18 Year-Olds
  Number                  204         112       103       419
  Mean                  107.9       106.9      95.7     104.6
  S.D.                   16.5        15.7      15.9      16.9

All Ages (2-18)
  Number                 1422         807       669      2898
  Mean                  106.2       104.9      96.6     103.6
  S.D.                   15.2        16.5      15.7      16.2

* Denver 2- to 5½-year-olds are excluded.


To determine $s_{x'}$ we shall take for our values of $\bar{x}_i$ (the mean I.Q. in
each community): (a) the average I.Q. for urban children for each of
the communities classified as “urban” by McNemar; (b) the average
I.Q. for suburban children for each of the communities classified as
“suburban” and (c) the average I.Q. for rural children for each of the
communities classified as “rural.” This approximation ignores all variations
between communities within the urban, suburban and rural groups
of communities. As a result the values of $s_{x'}$ which we obtain should be
equal to or less than the values which would be obtained if we knew
the means of each of the sampled communities.*

There are some uncertainties in the published data concerning the 
values of $m$ and $N_i$. As noted above, Terman and Merrill (6) state that
17 communities were sampled in 11 states. This would give $m = 17$.
However on pp. 36-37, McNemar (3) lists the communities sampled and 
the number of subjects in each community. McNemar lists 7 urban 
communities. He also lists 3 suburban communities and states that, in 
the suburban group, there were “four small communities just out of 
Kansas City in Johnson County Kansas, with 199 cases drawn from 
Westwood View, Hickory Grove, Roseland, and Shawnee Mission 
schools.” For the rural communities, McNemar states: 

The samplings from rural communities include 85 from Mount Washington 
School, Bullitt County, and Liberty School, Oldham County, Kentucky. A 
total of 152 were drawn from the following districts of Indiana: Prather School, 
Charlestown schools and Morgan Township School in Harrison County and 
Galena School in Floyd County. A farming region at Bloomington, Minnesota, 
supplied 92 cases; the farming and small village community of Randolph, Ver- 
mont, provided 275; and 65 subjects were secured in the vicinity of Atlee, Vir- 
ginia. We have already expressed some skepticism concerning the representa- 
tiveness of these communities. 

From this statement, it is difficult to determine the exact number of 
“rural communities” involved. At a minimum, there appear to be 8 
(assuming that schools in different counties represent different com- 
munities). If we also consider the four schools in the “suburban” part 
of Johnson County, Kansas, to be one community, McNemar’s listing 
gives a count of 19 communities vs. Terman and Merrill’s 17. The difference
appears to be one in the definition of community. In terms of
independently selected areas, Terman and Merrill’s “17 communities”
is probably more nearly correct. However, the data in Table 2 are based
on McNemar’s classification. It appears desirable to adopt a compromise,
counting as communities the cities and towns listed by McNemar
plus any schools in separate counties. This is the same basis we used
in getting the count of 19 communities mentioned above. Since the
count of independently sampled communities is probably Terman and
Merrill’s figure of 17, this handling of the problem operates in the same
direction as the other approximations previously made.

* This statement cannot be made absolutely since, under certain circumstances, it
may be incorrect. However, it is a fairly safe statement since the circumstances which
would give a higher standard error through substituting group averages for individual
averages are extremely unusual.

The difficulty in determining $N_i$ occurs in the cases where McNemar
gives one figure for the number sampled in two different counties (e.g.,
85 cases from Bullitt County, Kentucky and Oldham County, Kentucky).
These cases can be handled by distributing the cases equally
among the counties involved. This adjustment also operates to reduce
the estimated standard error. A further approximation is necessary to
get standard errors for the means of each of the three age groups in Table
2. McNemar gives only the total number of cases in each community
and does not give the distribution of these cases among the age groups.
To estimate the standard errors for the separate age groups, the number
of cases for each of the communities was distributed by age proportionately
to the age distribution in the class (urban, suburban or rural) in
which the community falls. The number of cases in each community
shown by McNemar and the calculated distribution of these cases by
age groups is shown in Table 3. This adjustment affects only the estimates
of the standard errors of the age group averages and not the
standard error for the entire group aged 2 to 18.

TABLE 3

Number of Cases Sampled in Each Community and
Estimated Age Distribution

                                  2-5½        6-14       15-18     All Ages
Communities                    Year-Olds   Year-Olds   Year-Olds    (2-18)

Urban
   1. Denver, Col.                 28          67          16         111
   2. Minneapolis, Minn.           46         111          26         183
   3. New York, N. Y.              12          29           7          48
   4. Reno, Nev.                   28          68          16         112
   5. Richmond, Va.                46         114          27         187
   6. San Antonio, Texas           63         155          36         254
   7. San Francisco, Calif.       131         320          76         527

Suburban
   8. White Plains, N. Y.          31         107          22         160
   9. Redwood City, Calif.         26          89          19         134
  10. Los Gatos, Calif.            62         209          43         314
  11. Johnson County, Kan.         39         132          28         199

Rural
  12. Bullitt County, Ky.           9          27           7          43
  13. Oldham County, Ky.            9          27           6          42
  14. Clark County, Ind.           11          32           8          51
  15. Harrison County, Ind.        11          32           8          51
  16. Floyd County, Ind.           11          31           8          50
  17. Bloomington, Minn.           20          58          14          92
  18. Randolph, Vt.                59         174          42         275
  19. Atlee, Va.                   14          41          10          65
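
The proportional distribution by age described above can be sketched as follows. The class age distributions are taken from Table 2. The paper does not say how fractional cases were rounded, so the largest-remainder rule used here is an assumption; it happens to reproduce the Table 3 rows checked in the usage note, and the function name is hypothetical.

```python
# Age distribution (2-5½, 6-14, 15-18) of each class of community, from Table 2.
CLASS_AGE_COUNTS = {
    "urban": (354, 864, 204),
    "suburban": (158, 537, 112),
    "rural": (144, 422, 103),
}


def allocate_by_age(community_total: int, community_class: str) -> list[int]:
    """Split a community's cases over the three age groups in proportion to
    the age distribution of its class, keeping the community total fixed.

    Rounding uses the largest-remainder rule (an assumption, not stated
    in the source). Integer arithmetic avoids floating-point ties.
    """
    counts = CLASS_AGE_COUNTS[community_class]
    class_total = sum(counts)
    # divmod gives the whole number of cases and the leftover numerator.
    parts = [divmod(community_total * c, class_total) for c in counts]
    allocation = [whole for whole, _ in parts]
    shortfall = community_total - sum(allocation)
    # Hand the remaining cases to the groups with the largest remainders.
    by_remainder = sorted(range(len(counts)), key=lambda k: parts[k][1], reverse=True)
    for k in by_remainder[:shortfall]:
        allocation[k] += 1
    return allocation


print(allocate_by_age(111, "urban"))   # Denver
print(allocate_by_age(42, "rural"))    # Oldham County
```

For Denver (111 urban cases) this yields 28, 67 and 16, and for Oldham County (42 rural cases) 9, 27 and 6, in agreement with Table 3.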

Comparison of Cluster Sampling Error with 
Unrestricted Random Sampling Error 

With all the adjustments reducing the standard error which have 
been made, it may seem surprising that we have any error left. How- 
ever, a fairly substantial amount of sampling error remains. Table 4 
shows the standard errors of the mean I.Q. calculated as described 
above (using Equation 3) compared with the standard error obtained 
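by the formula usually used in psychological research studies, i.e.:

$$s_{x'} = \frac{s}{\sqrt{N}} \qquad [4]$$

where

$$s^2 = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{N_i} (x_{ij} - x')^2 \qquad [5]$$

and

$$N = \sum_{i=1}^{m} N_i. \qquad [6]$$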

In Equations [4], [5] and [6], $x_{ij}$ stands for the value (I.Q.) of the
$j$-th individual in the $i$-th cluster (community) and the other symbols
have the meanings previously defined. Equation [4] represents the 
standard error of the mean of a sample drawn by unrestricted random 
sampling from an infinite population (i.e. a sample drawn so that the 
probability of drawing any observation in the population is equal to 
and independent of the probability of drawing each of the other obser- 
vations). 

It will be seen from Table 4 that the absolute values of the standard 
errors calculated by Equation [3] are not large. There is a sampling
error of only 1 per cent in the average I.Q. for the entire group of 2,898
cases. However, a very substantial difference exists between the standard
error by Equation [3] and the standard error by Equation [4]. If
we apply Equation [4] to determine the standard error of the mean of a 
cluster sample, it is obvious that we shall be very far from the correct 
value (in this case we would get an error which is less than one-third of 
the correct figure). 
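
The “All Ages” row of Table 4 can be recomputed from the published figures. The sketch below applies Equation [3], assigning to every community the mean I.Q. of its class (urban, suburban or rural) as described earlier, and compares the result with the unrestricted random sampling formula. Case counts are transcribed from Table 3 and means and the standard deviation from Table 2; the function and variable names are illustrative only.

```python
import math

# Cases per community (all ages, 2-18), from Table 3.
URBAN = [111, 183, 48, 112, 187, 254, 527]
SUBURBAN = [160, 134, 314, 199]
RURAL = [43, 42, 51, 51, 50, 92, 275, 65]

# Each community is given the mean I.Q. of its class (Table 2, all ages).
clusters = ([(n, 106.2) for n in URBAN]
            + [(n, 104.9) for n in SUBURBAN]
            + [(n, 96.6) for n in RURAL])


def cluster_standard_error(clusters):
    """Equation [3]: sqrt( sum N_i^2 (xbar_i - x')^2 / (m (m - 1) N'^2) )."""
    m = len(clusters)
    total_cases = sum(n for n, _ in clusters)
    sample_mean = sum(n * mean for n, mean in clusters) / total_cases  # x'
    average_size = total_cases / m                                      # N'
    weighted_ss = sum(n ** 2 * (mean - sample_mean) ** 2 for n, mean in clusters)
    return math.sqrt(weighted_ss / (m * (m - 1) * average_size ** 2))


def random_standard_error(sd, n):
    """Unrestricted random sampling: S.D. divided by the square root of N."""
    return sd / math.sqrt(n)


se_cluster = cluster_standard_error(clusters)
se_random = random_standard_error(16.2, 2898)
ratio = se_cluster / se_random

print(f"cluster S.E. = {se_cluster:.2f}")
print(f"random S.E.  = {se_random:.2f}")
print(f"ratio        = {ratio:.2f}")
```

The three printed values agree with the 1.01, .30 and 3.36 shown for all ages in Table 4.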

This fact is extremely important in applying tests of significance to 
differences of sample means. For example, suppose we took a sample 

TABLE 4 

Estimated Standard Errors of the Mean I.Q.'s for Cluster 
Sampling and Unrestricted Random Sampling 


Standard Errors Ratio of S,E. of Cluster 


Age Groups 

Cluster Sampling 

Unrestricted 
Random Sampling 

Sampling to S.E. of 
Random Sampling 

2-5 J years 

.60 

.62 

.97 

6-14 years 

1.09 

.38 

2.89 

15-18 years 

1.35 

.82 

1.63 

All Ages (2-18) 

1.01 

.30 

3.36 


of 900 children aged 2-18 (by a method which was actually random) 
from some city or other population subgroup. Assume that this sample 
gives us an average I.Q. of 105.7 on the revised Stanford-Binet and our 
sample has a standard deviation of 1 8, so that the standard error of the 
mean (using, quite properly. Equation [4]) is .60. Our group has a 
mean 2.1 points above the average of 103.6 for the Stanford-Binet re- 
vision sample shown in Table 2. We want to know whether this dif- 
ference is significant. If we assume unrestricted random sampling of 
the Stanford-Binet revision sample, we would use .30 (see Table 4) as 
the standard error of the revision sample mean. This would give us .67 
as the standard error of the difference of 2.1 and our difference would 
be 3.1 times its standard error. We would undoubtedly consider this a 
significant difference. Actually, the standard error of the mean of the 
revision sample is at least 1.01, which makes the standard error of the 
difference 1.17. The difference is actually only 1.8 times its standard 
error and can hardly be considered significant. 
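
The arithmetic of this example can be checked with a short sketch. The function and variable names are illustrative assumptions and not part of the original analysis; the figures are those given in the text and in Table 4, and the two samples are assumed to be independent.

```python
import math


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of the difference of two independent sample means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


# Our own (genuinely random) sample: 900 cases, S.D. 18, so Equation [4] applies.
se_ours = 18 / math.sqrt(900)
difference = 105.7 - 103.6

# Standard errors of the revision-sample mean, all ages (Table 4).
se_revision_random = 0.30   # wrongly assuming unrestricted random sampling
se_revision_cluster = 1.01  # estimate that allows for cluster sampling

for label, se_rev in [("random", se_revision_random),
                      ("cluster", se_revision_cluster)]:
    se_diff = se_of_difference(se_ours, se_rev)
    print(f"{label}: S.E. of difference = {se_diff:.2f}, "
          f"difference / S.E. = {difference / se_diff:.1f}")
```

Under the random-sampling assumption the ratio is a little over 3; with the cluster-sampling standard error it falls below 2, which is the contrast drawn in the text.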

The sample used for the Stanford-Binet revision is not an extreme 
case of the error which can be made by applying formulae based on un- 
restricted random sampling to data obtained by cluster sampling. The 
sampling for the Stanford-Binet revision did involve testing individuals 
from several communities and the standard error for cluster sampling is 
only 3 times the error for random sampling of the same number of individuals. 
Many studies use data from one or two groups (e.g., elementary 
psychology classes in two neighboring colleges) to draw conclusions 
about the whole population (all college students or even all human 
beings). In this case the standard error obtained from Equation [3] 
may be 50 to 100 times greater than that obtained from Equation [4]. 
Use of the “correct” formula (“correct” if we have used a random 
process for drawing our groups) will make supposedly significant dif- 
ferences vanish more rapidly than a quart of ice cream at a children’s 
party. 


Intraclass Correlation 

The reason for the difference between the standard error for un- 
restricted random sampling and that for cluster sampling is to be found 
in the fact that individuals are not sampled independently in cluster 
sampling. If we consider samples of equal size from the same population, 
the standard error of the mean in unrestricted random sampling is 
multiplied by approximately $\sqrt{1 + N\rho}$ when we use cluster sampling. 
Here N is the average size of our clusters and $\rho$ is the intraclass correlation 
(a measure of the extent to which individuals within a cluster resemble, 
or are “correlated” with, each other). The intraclass correlation 
usually ranges from 0 to +1 (although it can be negative). It can be 
seen that even very small values of the intraclass correlation (say, .01) 
can have a very substantial effect on the standard error of a mean in 
cluster sampling if the clusters are moderately large (N = 100 or more). 
As a matter of fact, the estimated intraclass correlation for the entire 
sample (all individuals aged 2 to 18) used in the Stanford-Binet revision 
is only .08. A recent paper by Walsh (7) gives some of the probability 
considerations involved in tests of significance when intraclass correlation 
is present. 
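
As a rough numerical illustration, the sketch below uses the standard survey-sampling result that, for clusters of equal size N, cluster sampling multiplies the variance of a mean by about 1 + (N - 1)ρ, so that the standard error is multiplied by the square root of that quantity. The function name is hypothetical and the cluster sizes are illustrative values, not figures from the Stanford-Binet sample.

```python
import math


def se_inflation(cluster_size: float, rho: float) -> float:
    """Factor by which cluster sampling multiplies the random-sampling S.E.

    Assumes equal-sized clusters and uses the usual approximation
    1 + (N - 1) * rho for the variance; the S.E. factor is its square root.
    """
    return math.sqrt(1 + (cluster_size - 1) * rho)


# Effect of a "very small" intraclass correlation of .01 as clusters grow.
for size in (10, 100, 1000):
    print(size, round(se_inflation(size, 0.01), 2))
```

With ρ = .01 the factor is close to 1 for clusters of 10 but rises to roughly 1.4 for clusters of 100, which is why even a small intraclass correlation cannot be ignored when clusters are large.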

There is one feature of Table 4 which may arouse some interest. 
This is the fact that the estimated standard error (using Equation [3]) 
of the mean I.Q. is larger for the group aged 6-14 years than for the 
group aged 2-5½ years. This is, of course, contrary to what we would 
expect from the description given of the sampling process. To some 
extent this peculiarity results from our ignoring subsampling in calculating 
the standard errors. Consideration of subsampling variation 
would probably increase the standard errors somewhat and would probably 
increase the standard error more for the group aged 2-5½ years 
than for the group aged 6-14 years (since there are fewer of the younger 
children). As a matter of fact, inclusion of subsampling variation might 
double the standard error for the mean I.Q. of the group aged 2-5½ but 
would probably not increase the standard error of the group aged 6-14 
more than 10 per cent. 

Actually Table 4 shows a lower standard error of cluster sampling 
for the group aged 2-5½ years than for the group aged 6-14 years because 
there is less variation among the average I.Q.’s of the urban, 
suburban and rural children for the younger group. This fact may be 
due to some basic relation between I.Q. variability and age. For example, 
McNemar (3) gives a table for adjusting I.Q.'s for differing 
standard deviation of the I.Q. at various ages. He bases this table on 
the differences actually found in the sample. 

Another explanation of the differences in variability between age 
groups is to be found in the selective nature of the sampling for the 
Stanford-Binet revision. Selective sampling seems to have been par- 
ticularly important in the pre-school group. In another article (4), the 
present writer points out some effects of selective sampling on group 
means and also notes that selective sampling will usually affect the 
standard deviation also. It would be very unwise to hypothesize about 
the difference between age groups shown in Table 4 unless we had much 
more information about the sampling probabilities. 

This article has used the Stanford-Binet only as an illustration of 
the dangers of ignoring the intraclass correlation when we are dealing 
with a cluster type of sampling. In view of the qualifications placed on 
our analysis, it is not possible to draw any conclusions about the reliability 
or unreliability of the revised Stanford-Binet as a measuring instrument. 
There may be good reasons for supposing that the precision 
of the revised Stanford-Binet is considerably less than many of its users 
assume. From the sampling standpoint, the sample design used in the 
revision of the Stanford-Binet was a non-measurable design and there 
is no way of telling how “bad” or “good” the results were. It has been 
suggested that the sampling errors shown in Table 4 are probably minimum 
figures. However, the results do offer a possibility of improving 
the sampling design in the event that the Stanford-Binet is revised 
again in the future. An error of 1 I.Q. point in the average I.Q. may 
not be too serious. If this is the case, the biases in the Stanford-Binet 
average I.Q.'s could probably be removed by using a sound sample 
design without any need for an increase in either the number of communities 
covered or the number of subjects tested. If greater accuracy 
than a mean correct within 1 per cent is considered necessary or desirable, 
this could probably be achieved by increasing the number of 
communities sampled without increasing to any great extent the total 
number of subjects tested. As a matter of fact, increasing the number 
of subjects tested would probably add very little to the accuracy of the 
final results (at least for the age group 6-14 years). The standard error 
of a mean in cluster sampling decreases (approximately) in inverse proportion 
to the square root of the number of clusters sampled. The standard 
error shown in Table 4 for unrestricted random sampling is .3 of an I.Q. 
point. The standard error for cluster sampling is 3.36 times this value. 
Therefore, to get a standard error of .3 using cluster sampling, we would 
need about 11 times as many communities or about 200 communities. 
This estimate of the number of communities required is, of necessity, 
unreliable, since we were forced to estimate our standard errors from a 
sampling plan which is actually non-measurable, and measuring the 
non-measurable puts an obvious strain on epistemology. 
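
The "about 11 times" figure follows from the square-root rule just stated, as the sketch below shows. The function name is hypothetical, and the present number of communities is not given in this passage; the value 18 is an assumption chosen only because it is consistent with the text's "about 200".

```python
def clusters_needed(current_clusters: float, current_se: float,
                    target_se: float) -> float:
    """Clusters required to reach target_se.

    Assumes the standard error is inversely proportional to the square
    root of the number of clusters sampled.
    """
    return current_clusters * (current_se / target_se) ** 2


# Multiple of the present number of communities (Table 4, all ages).
multiple = clusters_needed(1, current_se=1.01, target_se=0.30)
print(round(multiple, 1))

# ASSUMPTION: 18 communities at present (not stated in this passage).
ASSUMED_COMMUNITIES = 18
print(round(clusters_needed(ASSUMED_COMMUNITIES, 1.01, 0.30)))
```

Because the required number of clusters grows with the square of the desired reduction in standard error, a ratio of 3.36 in standard errors becomes a ratio of roughly 11 in communities.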

In designing a sampling plan for a revision of the Stanford-Binet, 
recent developments in sampling theory and practice can be used to 
increase accuracy without increase in survey costs. The reader’s attention 
is directed particularly to the work of Hansen and Hurwitz (1) 
in this field. Using the techniques developed by Hansen and Hurwitz, 
persons revising the Stanford-Binet would probably get satisfactory 
precision from a well-designed sample of 25 to 100 communities with 
only a very small increase (if any) in the total number of cases tested. 

Summary 

This article stresses the dangers of ignoring the intraclass correlation 
of the population when “cluster” methods of sampling are used. The 
increase in sampling error resulting from cluster sampling is demonstrated 
by an analysis of the results of the sample used in the revision 
of the Stanford-Binet. This sample actually yields “non-measurable” 
results, i.e. results which do not permit determination of the standard 
error of the sample mean. However, it is estimated that the standard 
error of the average I.Q. of this sample is at least 3 times the error which 
would be calculated by the use of the formula for unrestricted random 
sampling from an infinite population. The latter formula is the one 
familiar to psychologists and the one usually used by them regardless 
of the type of sampling involved. The illustration indicates that very 
substantial errors may result from this practice and that many results 
will be considered statistically significant where such a conclusion is 
entirely unwarranted. 

Appendix 

Although the formula for the standard error of the mean for cluster sampling 
is not new, psychologists are generally unfamiliar with it. The derivation of 
this formula is, therefore, presented below. The z transformation will be found 
useful in deriving standard errors for more complicated designs (e.g. designs 
using stratification, subsampling, differential sampling probabilities, etc.). 

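Equation [2] gives the mean square error (square of the standard error) of 
the mean of a cluster sample as: 

$$s_{\bar{x}'}^2 = \frac{M - m}{Mm} \cdot \frac{\sum_{i=1}^{m} N_i^2 (\bar{x}_i - \bar{x}')^2}{(m - 1)(\bar{N}')^2}$$

The mean of the sample is: 

$$\bar{x}' = \frac{\sum_{i=1}^{m} N_i \bar{x}_i}{\sum_{i=1}^{m} N_i}$$
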

$M$, $m$, $N_i$, $\bar{x}_i$, $\bar{N}'$, $\bar{x}'$ and $N$ are defined on p. 420. It is also convenient 
to define: 

$X_{ij}$ = value for jth individual in ith cluster 

$X_i = \sum_{j=1}^{N_i} X_{ij}$ = sum of the values for all individuals in ith cluster. 

From their definitions, it can be seen that: 

$$\bar{x}_i = \frac{X_i}{N_i}$$

and that 

$$\bar{x}' = \frac{\sum_{i=1}^{m} X_i}{\sum_{i=1}^{m} N_i}$$

can be treated as a ratio of two linear functions of the sample observations, 
namely: 

$$f(x) = \frac{M}{m} \sum_{i=1}^{m} X_i = \frac{M}{m} \sum_{i=1}^{m} \sum_{j=1}^{N_i} X_{ij}$$

$$f(N) = \frac{M}{m} \sum_{i=1}^{m} N_i = \frac{M}{m} \sum_{i=1}^{m} \sum_{j=1}^{N_i} 1.$$


In deriving $s_{\bar{x}'}^2$, it will be useful to prove the following theorem: 
Theorem: If we have a sample estimate: 

$$r' = \frac{f(x)}{f(y)}$$

where f(x) and f(y) are linear functions of the sample observations $x_h$ and 
$y_h$ (h = 1, 2, ..., n) and if: 

$$X = E f(x), \qquad Y = E f(y), \qquad z_h = \frac{x_h}{X} - \frac{y_h}{Y}$$

then: 

$$\sigma_{r'}^2 = r^2 \sigma_{f(z)}^2$$

where $\sigma_{r'}^2$ is the mean square error of r' and r is the population parameter of 
which r' is an estimate. 

Proof: When we have a sample estimate r' = f(x)/f(y), the mean square 
error of r' can be found by: (a) expanding r' as a Taylor series around X and 
Y (the expected values of f(x) and f(y)); (b) subtracting r (the true value of r' 
for the entire population) from both sides of the equation; (c) squaring both 
sides of the resulting equation and (d) taking the expected value of the resultant. 
If we ignore, in our Taylor series, terms involving partial derivatives 
higher than the first, the result of this operation will be: 

$$\sigma_{r'}^2 = E(r' - r)^2 = \left(\frac{X}{Y}\right)^2 \left( \frac{\sigma_{f(x)}^2}{X^2} + \frac{\sigma_{f(y)}^2}{Y^2} - \frac{2\sigma_{f(x)f(y)}}{XY} \right)$$   [7] 

If we let 

$$z_h = \frac{x_h}{X} - \frac{y_h}{Y}$$

and if f is a linear function, then: 

$$f(z) = \frac{f(x)}{X} - \frac{f(y)}{Y}$$   [8] 

and 

$$E f(z) = \frac{E f(x)}{X} - \frac{E f(y)}{Y} = \frac{X}{X} - \frac{Y}{Y} = 0.$$   [9] 

Therefore: 

$$\sigma_{f(z)}^2 = \frac{\sigma_{f(x)}^2}{X^2} + \frac{\sigma_{f(y)}^2}{Y^2} - \frac{2\sigma_{f(x)f(y)}}{XY}$$   [10] 

$$\sigma_{r'}^2 = r^2 \sigma_{f(z)}^2$$   [11] 


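The above theorem can be applied to derive the mean square error of $\bar{x}'$ as 
follows: 

$$\bar{x}' = \frac{f(x)}{f(N)}$$

where 

$$f(x) = \frac{M}{m} \sum_{i=1}^{m} X_i \quad \text{or} \quad f(x) = \frac{M}{m} \sum_{i=1}^{m} \sum_{j=1}^{N_i} X_{ij}$$

and 

$$f(N) = \frac{M}{m} \sum_{i=1}^{m} N_i = \frac{M}{m} \sum_{i=1}^{m} \sum_{j=1}^{N_i} N_{ij} \quad \text{where } N_{ij} = 1$$

$$E f(x) = \sum_{i=1}^{M} X_i = X$$

$$E f(N) = \sum_{i=1}^{M} N_i = N.$$

$$z_i = \frac{X_i}{X} - \frac{N_i}{N} \quad \text{or} \quad z_{ij} = \frac{X_{ij}}{X} - \frac{1}{N}\,*$$

$$f(z) = \frac{M}{m} \sum_{i=1}^{m} z_i = \frac{f(x)}{X} - \frac{f(N)}{N}$$
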

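By Equation [11]: 

$$\sigma_{\bar{x}'}^2 = \bar{X}^2 \sigma_{f(z)}^2$$   [13] 

where $\bar{X}$ is the population parameter estimated by $\bar{x}'$ and is: 

$$\bar{X} = \frac{\sum_{i=1}^{M} X_i}{\sum_{i=1}^{M} N_i} = \frac{X}{N}$$

Therefore: 

$$\sigma_{\bar{x}'}^2 = \frac{X^2}{N^2} \sigma_{f(z)}^2$$

Since f(z) is M/m times a sum of the sample values $z_i$: 

$$\sigma_{f(z)}^2 = \frac{M^2 (M - m)}{(M - 1)m} \cdot \frac{\sum_{i=1}^{M} (z_i - \bar{z})^2}{M}$$   [14] 

where 

$$\bar{z} = \frac{E f(z)}{M} = 0.$$

An unbiased estimate of $\sigma_{f(z)}^2$ from the sample is: 

$$s_{f(z)}^2 = \frac{M^2 (M - m)}{Mm} \cdot \frac{\sum_{i=1}^{m} (z_i - \bar{z}')^2}{m - 1}$$   [16] 

where 

$$\bar{z}' = \frac{\sum_{i=1}^{m} z_i}{m} = \frac{\sum_{i=1}^{m} X_i}{mX} - \frac{\sum_{i=1}^{m} N_i}{mN}.$$

* The result is the same whether the z transformation is applied to the cluster totals 
or the individual observations. 

From Equations [13], [14], and [16] we have: 

$$\sigma_{\bar{x}'}^2 = \bar{X}^2 \, \frac{M^2 (M - m)}{(M - 1)m} \cdot \frac{\sum_{i=1}^{M} (z_i - \bar{z})^2}{M}$$   [17] 

and 

$$s_{\bar{x}'}^2 = \bar{X}^2 \, \frac{M^2 (M - m)}{Mm} \cdot \frac{\sum_{i=1}^{m} (z_i - \bar{z}')^2}{m - 1}$$   [18] 

In Equation [17] we substitute the values: 

$$z_i = \frac{X_i}{X} - \frac{N_i}{N}, \qquad \bar{z} = 0, \qquad \bar{N} = \frac{\sum_{i=1}^{M} N_i}{M} = \frac{N}{M}$$

and get: 

$$\sigma_{\bar{x}'}^2 = \frac{M - m}{(M - 1)m} \cdot \frac{\sum_{i=1}^{M} (X_i - \bar{X} N_i)^2}{M \bar{N}^2}$$   [19] 

or 

$$\sigma_{\bar{x}'}^2 = \frac{M - m}{(M - 1)m} \cdot \frac{\sum_{i=1}^{M} N_i^2 (\bar{x}_i - \bar{X})^2}{M \bar{N}^2}$$   [20] 
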

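We make the same substitutions in Equation [18] and also substitute for 
$\bar{X}$ and $\bar{N}$ the sample estimates: 

$$\bar{x}' = \frac{\sum_{i=1}^{m} X_i}{\sum_{i=1}^{m} N_i}, \qquad \bar{N}' = \frac{\sum_{i=1}^{m} N_i}{m}.$$

This gives: 

$$s_{\bar{x}'}^2 = \frac{M - m}{Mm} \cdot \frac{\sum_{i=1}^{m} (X_i - \bar{x}' N_i)^2}{(m - 1)(\bar{N}')^2}$$   [21] 

or 

$$s_{\bar{x}'}^2 = \frac{M - m}{Mm} \cdot \frac{\sum_{i=1}^{m} N_i^2 (\bar{x}_i - \bar{x}')^2}{(m - 1)(\bar{N}')^2}$$   [22] 
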
In some cases, cluster sampling may introduce a substantial bias into the 
sample standard deviation (when the sample S.D. is used as an estimate of the 
population S.D.). This bias will be practically eliminated by use of the estimate: 

$$s^2 = \sigma_s^2 + s_{\bar{x}'}^2$$   [23] 

where $\sigma_s$ is the sample S.D. and $s$ is an estimate of the population S.D. 

Equation [23] can also be used for estimating the population S.D. from a 
sample with unrestricted random sampling. 
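
The estimate of Equation [2], namely (M - m)/(Mm) times the sum of N_i²(x̄_i - x̄')² divided by (m - 1)(N̄')², can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the computation actually used for Table 4: the function name is hypothetical, every individual in a sampled cluster is assumed to be observed (no subsampling), and the data in the usage lines are invented.

```python
import math
from typing import Sequence, Tuple


def cluster_mean_and_se(clusters: Sequence[Sequence[float]],
                        total_clusters: float = math.inf) -> Tuple[float, float]:
    """Return (sample mean, estimated S.E.) for a one-stage cluster sample.

    clusters: one sequence of observed values per sampled cluster.
    total_clusters: M, the number of clusters in the population; the default
        (infinite) makes the factor (M - m)/M equal to 1.
    """
    m = len(clusters)
    if m < 2:
        raise ValueError("at least two clusters are needed to estimate the S.E.")
    sizes = [len(c) for c in clusters]     # N_i
    totals = [sum(c) for c in clusters]    # X_i
    mean = sum(totals) / sum(sizes)        # x-bar'
    mean_size = sum(sizes) / m             # N-bar'
    # N_i^2 (xbar_i - xbar')^2 is the same quantity as (X_i - xbar' * N_i)^2.
    ss = sum((x - mean * n) ** 2 for x, n in zip(totals, sizes))
    fpc = 1.0 if math.isinf(total_clusters) else (total_clusters - m) / total_clusters
    variance = fpc / m * ss / ((m - 1) * mean_size ** 2)
    return mean, math.sqrt(variance)


# Invented data: three clusters whose means differ far more than their members.
mean, se = cluster_mean_and_se([[100, 102, 98], [110, 112, 108], [90, 92, 88]])
print(round(mean, 1), round(se, 2))
```

Only the cluster totals and sizes enter the calculation, which reflects the point of the derivation: the precision of the mean is governed by the variation between clusters.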



ILLUMINATION STANDARDS FOR EFFECTIVE 
AND EASY SEEING 

MILES A. TINKER 
University of Minnesota 

The problem of artificial illumination is of primary importance in all 
inside working environments. To maintain healthful and efficient functioning 
of the eyes, it is necessary to provide adequate lighting. Unquestionably, 
proper illumination contributes much to comfort and 
efficiency in activities of daily life. Working under faulty illumination 
frequently results in eyestrain which tends to be accompanied by reflex 
functional disturbances of other organs. 

During recent years a “lighting consciousness” has been forced upon 
a large portion of the population, particularly upon those who do considerable 
visual work under artificial light and upon those who must 
decide upon the illumination requirements of schools, offices, factories 
and other situations where visual work is to be performed. Although 
interest in lighting has been stimulated by popular articles, advertisements, 
and “educational pamphlets” (as well as by reports written by 
educators and medical men), the more fundamental information has 
appeared as experimental reports in scientific publications. The result 
of exposure to this material is a keen interest in illumination and a 
sincere desire on the part of the public for sound information concerning 
hygienic lighting. The natural tendency is to consult pamphlets on 
recommended practice when lighting specifications are needed for a 
particular situation. Frequently, the applied psychologist will be called 
upon to furnish advice on proper illumination. In many instances he 
will be asked to evaluate the materials presented in the recommended 
practices. Consequently, the applied psychologist should be informed 
concerning the adequacy of the data from which the lighting specifications 
in the recommendations are derived. 

The first code on lighting was issued by the Illuminating Engineering 
Society in 1915. In the more recent publications, the codes are known 
as Recommended Practice of Home Lighting, of Office Lighting, etc. 
These pamphlets have been prepared by the Illuminating Engineering 
Society either alone, or jointly with the American Institute of Architects, 
usually under the rules of procedure of the American Standards Association. 
Although the American Psychological Association has been 
in existence for over 50 years, and even though applied psychologists 
have been interested in the field and have been making experimental 
contributions to the hygiene of vision for over 40 years, neither psychology 
nor psychologists are represented in the group specifying recommended 
practices. Furthermore, a large body of psychological literature 
has been ignored, either because the illuminating engineers were not 
familiar with it or because they chose not to use it. The result has been 
an emphasis upon the engineering aspects of lighting with inadequate 
attention to certain psychological factors. More recently there has been 
some attempt to consider more of the psychological factors. Perhaps 
because engineers lack a psychological background, interpretations are 
frequently erroneous. Probably the most satisfactory approach to hy- 
gienic lighting could be achieved by coordinating the contributions of 
engineers, physiologists, and psychologists. 

Recent editions of recommended practices reveal an increased em- 
phasis upon control of direct and reflected glare, brightness contrast, 
and the diffusion or distribution of light. The tendency to specify rela- 
tively very intense light for many visual tasks is prominent. The pur- 
pose of this paper is to present a critical examination of the specifications 
in the more recent editions of recommended practices and to scrutinize 
some of the data from which the recommendations were derived. 

Spectral Quality of Light 

In general, spectral quality of light receives adequate treatment in 
recommended practices (35, 36, 37, 38). It is stated that with equal 
foot candles of illumination, variations in color quality of light found in 
common illuminants have little or no effect upon the visual discrimina- 
tion involved. When color is to be discriminated, it should be viewed 
under as close an approximation of daylight as possible. Luckiesh (10) 
has a valuable discussion of light and color. 

Quality of Lighting 

Recommendations (35, 36, 37, 38) concerning control of glare, dif- 
fusion, direction and distribution of light, light reflection value, and 
effects of finishes on ceilings and walls are ordinarily quite satisfactory. 
Visual discrimination is improved by moving the glare source away from 
the line of vision and by reducing the brightness of the light source and 
the amount of light emitted by the light source toward the eye. Bright- 
ness of luminaires should be low in value. High brightness contrasts 
within the field of vision should be avoided whether on the work surface 
or in other parts of the visual field. Proper diffusion of light helps to 
eliminate undesirable shadows. Purely local lighting, therefore, is unsatisfactory. 
Since the reflection factors of objects in the visual environment 
play an important role in illumination, the finish of ceilings, walls, 
floors and furnishings is important. These surfaces should provide 
reflecting surfaces to help spread the light about the room. Further- 
more, they should be such that undesirable brightness contrast does not 
occur within the field of vision. Shiny or glossy finishes should be 
avoided to prevent specular glare. 

In the recommended practices, informative discussions on classifica- 
tion of lighting systems are usually included. Also illustrations of fix- 
tures and installations are sometimes given. Some attention is given 
to daylight illumination and the need of coordinating artificial with day- 
light lighting. 

Intensity of Illumination 

Intensity of illumination receives by far the greatest emphasis in 
specifications. With each revision of a lighting code prepared by il- 
luminating engineers, the foot candle recommendations for a given situ- 
ation rise. One may well question whether this trend has a scientific 
basis, or whether the consumer has been educated to accept the higher 
intensities. In 1934, Luckiesh and Moss (11) presented general recom- 
mendations which they considered to be very conservative. These are 
repeated with slight changes in Luckiesh’s 1944 book (10). He adds 
that these are inadequate in many cases where hundreds and even 
thousands of foot candles of light are desirable. Examination of the 
recommended practices of lighting reveals that, for the most part, they 
are based upon researches done and interpretations made by Luckiesh 
and his co-workers, or upon researches inspired by them. Let us turn 
first, therefore, to these reports. 

In Light, Vision and Seeing, Luckiesh (10), and in the New Science 
of Seeing, Luckiesh and Moss (11), make the following foot candle 
recommendations for common tasks of the work-world: 

1. 100 foot candles or more are specified for severe and prolonged visual 
work. Examples include fine needle work, pen work, engraving and assembly, 
and discrimination of fine details involving low contrast. 

2. 50 to 100 foot candles should be used for proof-reading, difficult reading, 
watch repairing, and average sewing. 

3. 20 to 50 foot candles are listed for such visual tasks as clerical work, 
ordinary reading and average sewing on light goods. 

4. 10 to 20 foot candles are proposed for ordinary reading and sewing on 
light goods when the task is not prolonged. 

5. 5 to 10 foot candles are needed for visual work which is more or less interrupted 
or casual. 

6. 1 to 5 foot candles are sufficient for perceiving large objects. 

Luckiesh (10) states that these are minimum foot candle recommendations 
and that he considers them to be very conservative from the 
viewpoint of ease of seeing. Furthermore these foot candles, according 
to Luckiesh and Moss (11), are far below the intensities of illumination 
which new knowledge indicates to be ideal. 

These recommendations are derived from various sets of data which 
will be discussed in turn. 

Preferences for light intensity. Luckiesh and Moss (11) cite data on 
preferences for light intensities to support their contentions that high 
intensities are necessary for adequate seeing. The mean choice was 
about 100 foot candles but the median was 50 foot candles when up to 
1000 foot candles were available. Tinker’s analysis (22) of light prefer- 
ence studies indicated that visual adaptation plays an important role 
in determining the preferences. In an experimental check, Tinker (26) 
found that when readers were adapted to 8 foot candles, the median 
choice for comfortable reading was about 12 foot candles. But when 
adapted to 52 foot candles, the median choice was 52 foot candles. It is 
obvious that the intensity of illumination to which the reader is adapted 
plays a dominant role in his illumination preference. The conclusion is, 
therefore, that preference for illumination intensity is not a satisfactory 
method for determining the intensity of light needed for efficient visual 
work. 

Visual acuity. Luckiesh and Moss (11) and Luckiesh (10) list 
visual acuity as a basic factor in reading (and presumably in other 
visual work). It is true enough that visual discrimination does depend 
somewhat upon visual acuity. But is visual acuity an adequate criterion 
for prescribing appropriate lighting? Luckiesh and Moss (13) admit 
that in many tasks the criterion of visual acuity is relatively inap- 
propriate, e.g. in tasks involving low contrasts. But they point out that 
for black test objects on a white background, visual acuity improves up 
to 100 foot candles. As a matter of fact, Lythgoe (15) has shown that 
under certain conditions of measurement, visual acuity improves up to 
and beyond 1000 foot candles. Inspection of the data reveals that the 
knee of the curve of improvement is at about 10 foot candles and that 
beyond about 20 foot candles the gains are slight. It must be kept in 
mind that in measuring visual acuity, one is dealing with threshold 
values. It is highly questionable whether the almost microscopic gains 
in visual acuity obtained under the high foot candles justify their appli- 
cation to visual tasks where supra-threshold visibility is involved as in 
most everyday situations. Furthermore, data reveal that the visual 
acuity curve is practically horizontal from 50 foot candles to the higher 
levels. 

Luckiesh and Moss (11) and Luckiesh (10) cite data on visual acuity 
for 1, 10, and 100 foot candles only. If they really desired to find the 
foot candle level beyond which no practical gains in visual acuity occur, 
they should have investigated the range between 10 and 100 foot candles. 
As shown in Tinker’s reviews (29, 31), this criticism may be aimed 
at all the basic data presented by Luckiesh (10). In some instances 
(decrease in heart rate, decrease in convergence reserve of ocular 
muscles), data for only 1 and 100 foot candles are presented. This procedure 
is inexcusable in experiments designed to determine how much 
light intensity is needed for efficient visual work. It appears, then, that 
visual acuity data are of only slight use for prescribing illumination 
intensities for visual discrimination in supra-threshold tasks. If accepted, 
there is no justification for suggesting that more than 40 to 50 
foot candles are necessary for adequate discrimination even for tasks 
that approach threshold discrimination. 

Visibility measurements. Luckiesh (10) states that “After establish- 
ing a standard of visibility or desirable see-level to be attained if pos- 
sible for all tasks, it is seen that specifications of light and lighting and 
other aids to seeing can be based upon visibility measurements." The 
measurements are to be made by the Luckiesh-Moss Visibility Meter. 
This is a device consisting of two identical circular gradients which are 
rotated before the eyes to alter the brightness contrast of the object 
whose visibility is to be measured. It, therefore, reduces the object to 
threshold visibility. It is the threshold which is measured. Three as- 
sumptions are made: (a) Two objects are equal in visibility when both 
are barely visible, (b) “Two objects are equally above threshold visi- 
bility when their visibility has been increased by the same increase" in 
size, brightness, brightness contrast or time, (c) “The visibility of an 
object, or degree of supra-threshold visibility, is proportional to the 
decrease in any one of the fundamental factors necessary to reduce the 
object to threshold visibility." These assumptions are considered to be 
axiomatic and arguments against them are considered to be futile. Nevertheless, 
since recommended standards are based upon visibility 
measurements to a large degree, it seems desirable to examine the 
matter further. Things are not axiomatic just because some one says 
they are. 

Since visibility measurements are in terms of threshold values, they 
are analogous to visual acuity measurements. They are subject, there- 
fore, to the same criticisms as visual acuity measurements as criteria for 
prescribing illumination standards. 

Luckiesh (10) emphasizes foot candles for equal visibility in prescribing 
illumination intensities. For example, to make newspaper text 
matter equivalent in visibility to 8 point book type on white paper under 
10 foot candles of light, it is necessary to use 30 foot candles. And to 
make the 1/64" divisions on a steel scale equal to this visibility level, 
180 foot candles are needed. Are these levels of illumination intensity 
required for efficient and comfortable seeing? Luckiesh (10) assumes 
that this is a conservative standard. On his empirical scale, the 8 point 
type with 10 foot candles has 48 per cent maximum visibility. (Maximum 
visibility is obtained from a test-object whose critical detail has a 
visual size of 20 minutes; a critical detail of 1 minute is the smallest 
visible for persons with normal vision under 10 foot candles of light.) 
But no adequate experimental check is made for performance of these 
tasks under various levels of illumination. Tinker (27) found the 
critical illumination level (the intensity beyond which no further change 
in reading performance occurs as the intensity is increased) for reading 
7 point newspaper type to be approximately 7 foot candles. It is difficult 
to conceive the need of going above 20 foot candles to provide a margin 
of safety above the critical level. It is highly probable that an experimental 
check will reveal that other visual tasks, like discriminating the 
divisions on a steel scale, do not require the 180 foot candles indicated 
for efficient vision by the computations of Luckiesh. Related to this is 
the question of comfortable vision. Harrison (8), in discussing the difficulty 
of using high intensities because of the introduction of glare factors, 
states, “Visibility and comfort are two separate factors which do 
not always overlap completely.” 

No one will deny that visibility is an important factor in ease of 
seeing. But to prescribe standards in terms of scores derived from 
measurements made with the Visibility Meter is open to serious ques- 
tion. The basic data are threshold scores. While the derived scores may 
appear logical, supra-threshold seeing is not the same phenomenon as 
threshold seeing. Apparently, as illumination intensity is increased, one 
soon reaches a level of diminishing returns where further increase is of 
no practical importance or may introduce harmful factors from the 
viewpoint of easy and comfortable seeing. 

Nervous muscular tension. Luckiesh and Moss (11, 12) place great 
stress upon the apparent decrease in nervous muscular tension during 
reading as the illumination intensity is increased from 1 to 10 to 100 
foot candles. Tinker's (22) analysis of their data reveals that the method 
employed to present their results magnifies minute differences so that 
they appear large. Interpolation shows only gradual changes from 10 
to 20 to 25 foot candles and very slight changes from there on to 100 
foot candles. The conclusion that high foot candles are needed for 
ordinary reading is not valid. In a comparable situation, Tinker (25) 
found that for reading 10 point type, the critical intensity was about 3 
foot candles. Below this level, rate of reading was retarded and fatigue 
increased, but for higher intensities there was no change. For people 
with normal vision, 10 to 15 foot candles should provide a satisfactory 
margin of safety for reading legible print. 

Frequency of blinking. Another favorite criterion employed by 
Luckiesh and Moss (11, 12) and Luckiesh (10) as a basis for prescribing 
illumination intensities for visual work is frequency of blinking. The 
typical experiment is to measure the rate of involuntary blinking for the 
first and for the last five minutes for an hour's reading under 1, under 
10 and under 100 foot candles of light. They note that the blink rate is 
greater under the 1 than under 10, and greater under 10 than under 100 
foot candles. Therefore, it is concluded that relatively high intensities 
are desirable for reading. Even if these data are accepted as valid, we do 
not know where between 10 and 100 foot candles the curve of increased 
efficiency flattens out since intermediate intensity values were not 
studied. But there are several sources of information which suggest that 
blink rate is not a valid criterion of ease of seeing: 

1. McFarland, Holway, and Hurvich (18), after a searching analysis of 
their own extensive experiments and of other studies, state: “A high blink-rate 
need mean neither an increase in fatigue nor an increase in difficulty of seeing.” 
They conclude that “the rate of blinking can hardly be considered as a valid 
index of visual fatigue.” 

2. Tinker (32), in a study that has some bearing on the subject, found that 
frequency of blinking is an inadequate criterion of readability of print. 

3. Bitterman (1), working with 3 and 91 foot candles of light, found that 
when subjects read for 40 minutes there was no significant difference in rate of 
blinking. In fact the frequency of blinking was slightly greater under the 91 
foot candles. Incidentally, Bitterman also found no significant difference in 
blink rate for reading large type vs. small type. His results, therefore, indicate 
that rate of blinking cannot be employed as an index of ease of visual work. 
Further studies by Bitterman and Soloway (2, 3) showed that frequency of 
blinking is unrelated to duration of visual work or to the presence of a relatively 
intense glare source in the visual field. The reports of McNally (19) and Mac- 
Pherson (16) also cast doubt upon the validity of blinking as an index of ease 
of seeing. 

4. The statistical treatment employed by Luckiesh and Moss (11, 12, 14) 
upon their data is open to severe criticism. Tinker (28, 29) has questioned the 
appropriateness of the geometric mean which they employ in most comparisons. 
The same criticism is raised by Hoffman (9). In a searching analysis, Hoffman 
also severely criticizes the use of the percentage technique employed by Lucki- 
esh and Moss for presenting data, and for basing conclusions on percentage 
differences rather than on raw score differences. Percentage scores are notori- 
ously unreliable. Furthermore, if the raw scores are below 100 (as most of 
them are), percentages magnify the differences. When percentages are used, 
therefore, the observed differences may be largely an effect of the derivation. 
Insignificant raw score differences may seem large when put into percentages. 
For instance, a typical average of 30 blinks during 5 minutes of reading is in- 
creased 10 per cent by a change of 3 blinks. Hoffman further points out that 
work decrement may be a more important variable than illumination changes 
in the results of Luckiesh and Moss. In general, he found little support for the 
contention that relatively high intensities are needed for effective and easy 
seeing. 

5. Eames (5) criticizes Luckiesh and Moss (14) for using relatively few sub- 
jects in their experiments (including blink rate studies) and for employing 
“test wise” subjects. As pointed out by Eames, “People who take tests re- 
peatedly in a given field gradually learn what is expected of them” and are un- 
intentionally influenced by this knowledge. Results obtained under such con- 
ditions cannot be representative of the reactions of the general population. 

The accumulated evidence indicates that rate of blinking cannot be 
accepted as a criterion for specifying intensities of light for visual work. 
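
The arithmetic behind the criticism of percentage scores in point 4 above can be sketched briefly. This is an illustration only, not a procedure taken from Hoffman or from Luckiesh and Moss; the function name `percent_change` and the contrasting base of 300 are assumptions introduced here.

```python
def percent_change(base, delta):
    """Express a raw-score difference `delta` as a percentage of `base`."""
    return 100.0 * delta / base


# Example from the text: a typical average of 30 blinks in 5 minutes,
# changed by 3 blinks.
small_base = percent_change(30, 3)

# Hypothetical contrast (not from the text): the same 3-blink change
# on a base above 100.
large_base = percent_change(300, 3)

print(small_base)  # 10.0
print(large_base)  # 1.0
```

When the base is below 100, the percentage figure (10) is larger than the raw difference (3); when the base is above 100, the same raw difference shrinks to a smaller figure (1). The apparent size of an effect therefore depends on the derivation as much as on the observations.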

Decrease in heart rate. Luckiesh (10) and Luckiesh and Moss (11, 
12, 14) cite data on change of heart rate while reading for one hour 
under 1 foot candle and under 100 foot candles of light. No data are 
presented for intermediate levels of illumination. It is stated that 
heart rate decreased 10 per cent under the 1 foot candle and 2 per cent 
under 100 foot candles. The conclusion was that from the viewpoint of 
ease of seeing the 100 foot candle level is desirable. An experiment by 
McFarland, Knehr and Berens (17) was designed to check the findings 
obtained in Luckiesh’s laboratory. The results led to the conclusion 
that “It is questionable whether reliable criteria for determining ade- 
quate levels of illumination for tasks such as reading during short peri- 
ods of time (approximately 2 hours) can be obtained in terms of . . . 
heart rate ....” Another check experiment was carried out by Bitter- 
man (1), who recorded heart rate while reading under 3 and under 91 
foot candles of light. “The results do not support the conclusions of 
Luckiesh and Moss with respect to the value of heart rate” as an index 
“of the ease of visual work.” In view of the above evidence we must 
reject heart rate as a criterion for prescribing illumination intensities for 
visual work. 

Decrease in convergence reserve. Luckiesh and Moss (11, 12, 14) and 
Luckiesh (10) cite data on decrease in convergence reserve of ocular 
muscles after reading for one hour under 1 and under 100 foot candles of 
light. The decrease was less under the 100 foot candles. No data are 
given for the range between 10 and 100 foot candles. We do not know, 
therefore, whether the 100 foot candles is significantly better than such 
levels as 20 or 30 foot candles. 

Visual adaptation. Throughout their writings, Luckiesh and Moss 
(10, 11, 12, 14) emphasize that the eyes evolved under daylight levels 
of illumination and suggest the desirability of competing with daylight 
by artificial means. They consistently ignore the fact that the eyes 
readily adapt to easy and effective seeing over a wide range of illumina- 
tion intensities. 

Summary on intensity of illumination. Examination of the data em- 
ployed by Luckiesh and Moss as a basis for specifying foot candle levels 
for visual work reveals a general lack of validity of these results as 
criteria for ease of seeing. The data from visual acuity, muscular tension 
and visibility measurements are misinterpreted or misapplied. The 
blink technique and rate of heart beat must be rejected because of lack 
of confirmation by independent workers. Furthermore the methods of 
statistical analysis employed are frequently at fault. Any science of 
seeing based upon such an unstable foundation must, therefore, lack 
validity. Since these data have been the justification for specifying 
what appear to be excessively high levels of illumination intensity, we 
must reject such specifications unless justified by valid evidence from 
new experimentation. 


Lighting Codes 

School lighting. The American Recommended Practice of School 
Lighting (35) specifies the following minimum foot candles in service: 
15 for classrooms, shops and offices; 25 for sewing and drafting rooms; 
and 30 for sight-saving classes. There is general agreement on the im- 
portance of hygienic illumination in reading and study situations. The 
recommended foot candle levels seem satisfactory in view of research 
findings other than those cited in the code. There should be, of course, 
a sound experimental basis for recommendations of this kind. Tinker 
(23) has pointed out that the recommended practice for school lighting 
is based upon conclusions derived from misinterpreted experimental 
results. Fortunately, the recommended practice is adequate in spite of 
inferences from inadequate data. 

In a later publication by Sturrock (21), the foot candle levels are 
not in an approved code but are listed as the levels found desirable in 
the experience of successful business institutions, i.e., good present-day 
practice. For schools the foot candles listed include: 30 for study halls, 
class rooms, general laboratories, general manual training; 50 for draw- 
ing room, close work in laboratory, sight saving classes; 100 (considered 
especially low) for close work in manual training, and in sewing rooms. 
It is obvious to the impartial person who knows the field that these 
suggestions represent more intense illumination than is necessary for 
adequate seeing in the school situation. Data summarized by Tinker 
(24) and additional experimental evidence (25, 27) indicate that about 
15 foot candles are adequate for ordinary schoolroom tasks and that 
25 to 30 foot candles are satisfactory for the more severe tasks. Justifi- 
cation for the higher intensities is sought in the discussions of Luckiesh 
and Moss (12, 14) and Luckiesh (10). These have been evaluated above. 

Office lighting. The Recommended Practice of Office Lighting (36) 
includes the following foot candle levels: 50 for difficult seeing tasks 
such as accounting, bookkeeping, and drafting; 25 for ordinary seeing 
tasks such as general office work, private office work, mail rooms; 10 
for casual seeing tasks such as reception rooms and washrooms; 5 for 
simple seeing tasks such as halls and stairways. Considering the se- 
verity of the tasks performed by some workers in general offices and 
special (as accounting) offices, the above recommendations are satis- 
factory. The 50 foot candles, however, should be considered liberal even 
for the difficult seeing tasks. The statement that “higher values will 
contribute greatly to accuracy, speed and ease” cannot be accepted as 
valid. 

Sturrock’s (21) summary of good present day practice does not devi- 
ate markedly from the recommended practice except that typing and 
prolonged reading of shorthand notes are listed at 50 foot candles and 
intermittent reading and writing at 30 foot candles. Each of these is 
about twice what is needed in terms of the visual task. The basis for 
the higher intensities is in terms of the discussions of Luckiesh and 
Moss (12, 14). The inadequacy of these data has been pointed out 
above. 

Industrial lighting. A wide range of illumination intensities is 
recommended for various tasks in industry (37). Among the higher foot 
candle recommendations are: over 100 foot candles for such operations 
as extra fine assembly, automobile finishing and inspecting, cutting and 
sewing dark goods, engraving, proofreading, final inspection of tire 
casings, grading and sorting tobacco products, and certain inspection 
work in textiles; 50 to 100 foot candles for such operations as automobile 
assembly line, glass works inspection, fine inspection, bookkeeping, font 
assembly-sorting in printing industry, tin plate inspection, and stitching 
dark leather. With regard to all the recommendations, one is cautioned 
that the foot candles are minimum operating values and that in almost 
every instance higher values may be used with greater benefit. 

It is stated that the recommendations are taken from a series of 
studies on the illumination needs of specific industries, or, if not avail- 
able there, from current good practice. Examination of these studies 
(listed on page 23 of the report) indicates that in the main they are 
surveys rather than experiments. Furthermore, there is a lack of ade- 
quate descriptions of the survey techniques employed. In a few in- 
stances a general description of methods was given. Apparently what 
happened was first to make a survey of practice. This was followed by 
some sort of job analysis to determine what had to be discriminated. 
Then by reference to research studies (such as those reported by Luck- 
iesh and Moss in their books) the intensity level of illumination pre- 
sumably needed for the specific job was deduced. This method has 
some virtue provided sound data are referred to, which was not done in 
these cases. In a few instances it is stated that visibility measurements 
were made. Occasionally installations to achieve the recommendations 
were made, the effect observed and additional modifications made. In 
no case was there experimental determination of the light intensity 
needed. 

There are no valid experimental data which indicate that more than 
50 foot candles are needed even for those practical visual tasks which 
approach threshold discrimination. Furthermore, as pointed out by 
Harrison (8), visual comfort may decrease under high intensities. 

Home lighting. The most recent recommended practice for home 
lighting (38) specifies intensities ranging from 10 foot candles on card 
tables to 100 and more for sewing on dark goods. Forty foot candles are 
recommended for such situations as children’s study table, kitchen work 
counter, laundry, and for prolonged reading. There is no valid reason 
for going above 25 to 30 foot candles for the more severe visual tasks in 
the home (24). Approximately 15 foot candles is adequate for many 
of these visual tasks. Figure 1 in Recommended Practice of Home Light- 
ing (38) is misleading. “This chart shows the extent to which occupa- 
tions and poor seeing conditions leave their mark on eyesight.” The 
implication is that poor illumination causes ocular disability. There are 
no valid data which indicate this to be so. This chart represents an un- 
justified form of propaganda. 

Present-day practice. Sturrock (21) has assembled foot candle levels 
of illumination which are labeled “good present-day practice.” The 
tables are preceded by a classification (after Luckiesh and Moss) of foot 
candle needs for visual discrimination of tasks varying in difficulty. The 
material is apparently designed as a guide but is not necessarily in the 
form of recommendations. This sort of thing is valuable in many ways. 
But since it is based to a considerable degree upon the material pre- 
sented by Luckiesh and Moss (12, 14) and by Luckiesh (10), the illumi- 
nation intensities are excessively high in some instances — as 100 foot 
candles for sewing and proofreading, and 50 foot candles for reading 
small type and for kitchen counters. It should be pointed out, however, 
that much of the material is fairly satisfactory. 

In general, recommended practice prior to 1940 (35) is fairly ade- 
quate, but as new codes are issued at later dates the apparent tendency 
has been to recommend as intense lighting as the traffic will bear. This 
is justified by referring to the work of engineers (largely Luckiesh and 
Moss) who state that these high intensities are nevertheless inadequate 
for easy seeing. As pointed out above, both the experiments and the 
conclusions which are cited as fundamental are frequently invalid. Fur- 
thermore the data are out of line with other independent experimental 
results. 

Eye disabilities. It is generally accepted that eyes with disabilities, 
even when corrected by glasses, need brighter light than normal eyes for 
adequate visual discrimination. Ferree and Rand (6) and Ferree, Rand 
and Lewis (7) are usually cited as supporting evidence. In the first 
study (6), it was found that apparent diopters of accommodation in- 
creased more for 14 presbyopes than for normal eyes in going from 1 to 
5 to 25 foot candles of light. Interpolation indicates that for the normal 
eyes the curve of improvement shows little rise after about 8-10 foot 
candles; for the presbyopes, after about 15 foot candles. In addition, 
one myope and one presbyope were compared with a normal subject by 
measuring apparent diopters of accommodation at 13 intensities from 
0.5 to 100 foot candles. The curve of efficiency for the normal person 
improved rapidly to 5 foot candles, then more slowly to about 20 and 
very gradually thereafter; for the myope there was considerable im- 
provement to about 20 foot candles and little thereafter; for the pres- 
byope there was considerable improvement to about 38 foot candles, 
and then slower improvement to 100 foot candles. It is of course im- 
possible to generalize from one case, but apparently those with eye dis- 
abilities need somewhat brighter light than normals for clear seeing. 
This does not mean that they need 100 foot candles or more, as some 
people wish to imply. 

In the other study (7) Ferree, Rand and Lewis were concerned with 
distant (20 feet) vision. The visual acuity for 4 presbyopes was com- 
pared with acuity for 3 normal people. The presbyopes continued to 
gain in visual acuity from 25 to 100 foot candles while the normal eye 
made little gain within this range. Since there is little or no relation 
between acuity of distant vision and acuity at near vision, these results 
have no bearing upon visual discrimination at the work surface (desk, 
work bench, etc.). Furthermore, one should not prescribe illumination 
for suprathreshold tasks in terms of threshold measurements (visual 
acuity). There is no evidence from these studies which implies that 
excessively high foot candles are necessary for those with ordinary visual 
disabilities. Rather, they suggest a moderate increase for those with 
corrected vision as compared with normal eyes. 

Visual adaptation. It is well established that the eyes readily adapt 
to easy and effective seeing over a wide range of illumination intensities. 
This adaptation is rather slow in going from bright to dimmer illumina- 
tion (for practical purposes, 15-20 minutes) and rapid in going from 
dim to bright illumination (1-3 minutes). Tinker (25) has demonstrated 
that when adaptation is incomplete on shifting to a lower level of il- 
lumination, speed of perception is retarded. When adaptation is ade- 
quate, however, visual perception in reading is fully effective from 3 
foot candles up for normal eyes in reading legible print. In another 
study, Tinker (26) showed that subjects tend to prefer for reading ap- 
proximately the illumination intensity to which they have been adapted, 
whether it be 8 or 52 foot candles. These data indicate that readers 
tend to consider comfortable for easy reading any one of a wide range 
of illumination intensities provided such intensities are above critical 
levels and provided visual adaptation is adequate. Codes of lighting 
have consistently ignored the role of visual adaptation in seeing. They 
carefully point out that the eye has evolved under the bright illumina- 
tion of daylight, but do not mention that the eye also evolved to see 
adequately at low as well as at high intensities of light. 

Illumination for Adequate Seeing 

Critical levels of illumination. The critical level of illumination is the 
intensity beyond which there is no further increase in efficiency of per- 
formance as the foot candles become greater. Tinker (24) has sum- 
marized the data for critical levels of illumination: for reading of legible 
print (about 10 point on good paper) by adults, it is approximately 3 
to 4 foot candles; for reading and study of children, 4 to 6 foot candles; 
for arithmetical computations, less than 9.6 foot candles; for sorting 
mail, 8 to 10 foot candles; for the exacting task of setting six-point type 
by hand, 20-22 foot candles; and for very fine discrimination required 
to thread a needle, 30 foot candles. In a later study, Tinker (27) found 
the critical level of illumination for reading newspaper print to be about 
7 foot candles. Employing intensities from 2 to 55 foot candles, Rose 
and Rostas (20) found that reading efficiency, in terms of speed and 
comprehension, did not increase by a measurable amount with increased 
intensity of illumination. 

Adequate levels of illumination. It is obvious that visual work should 
not be done at critical levels of illumination. There should be an ade- 
quate margin of safety to provide for individual variation and the like. 
For such visual tasks as reading good-sized print (10 to 11 point) on a 
good quality paper, i.e., print of good legibility, 10 to 15 foot candles 
should provide hygienic conditions when one's eyes are normal. For 
situations comparable to the reading of newsprint, 15 to 20 foot candles 
should be adequate. In situations involving the reading of handwriting 
and other comparable tasks, 20 to 30 foot candles seem desirable. For 
tasks comparable to discrimination of 6 point type, there should be 30 
to 40 foot candles. And for the most severe tasks encountered in work- 
day situations, 40 to 50 foot candles will be found adequate. There is 
no valid experimental evidence now available that indicates a need for 
over 50 foot candles intensity for adequate visual discrimination. The 
intensity values from 10 to 20 should be increased somewhat (5 to 10 
foot candles) for eyes with slight disabilities or for those with correc- 
tions. For the higher values, however, no practical gain will be achieved 
for these people by increasing the intensity. The above suggestions hold 
for school children as well as for adults. In general, the child has much 
less severe visual tasks than adults. 

Intensity of illumination cannot be prescribed without coordinating 
it with other factors such as distribution of light and brightness contrast. 
A good example of the uselessness of excessively bright light is found in 
the study by Darley and Ickis (4). They were concerned with vision 
in the drafting room, a very severe visual task. In comparing 30 with 
75 foot candles of indirect light, they found the efficiency ratings for 
the two to be only slightly different. When they compared 40 with 80 
foot candles of direct light (troffer) under conditions of no reflected 
glare, they also found no significant differences in the efficiency ratings. 
The observations of Harrison (8) are relevant here. He points out the 
danger of glare with installations of 50 foot candles and above of arti- 
ficial illumination. 


Summary 

Examination of the literature upon which lighting recommendations 
are based reveals that some techniques of experimentation are invalid, 
and that interpretations from certain other data are unwarranted. Some 
of the recommendations are adequate, others are not. The trend seems 
to be to specify as high intensities as the traffic will bear and at the same 
time to suggest to the consumer that if he uses still higher intensities 
he will improve his ease of seeing. All will agree that there should be 
sufficient light for adequate seeing. It is high time, however, that the 
consumer know what is adequate and what is surplus. As pointed out 
by Winslow (34), illumination should conform to real human needs. It 
is human health and comfort which are at stake. 

In general the recommended practice concerning distribution of 
light, brightness contrast and color of light is satisfactory. 


ON FESTINGER'S EVALUATION OF SCALE ANALYSIS 
LOUIS GUTTMAN 

Department of Sociology and Anthropology, Cornell University 

The theory of scale analysis had its origin some seven years ago. 
Since that time, especially by virtue of extensive and intensive research 
done in the Army, some of its further ramifications have been explored 
and several techniques have been devised for carrying an analysis out 
in practice. The power and incisiveness of this approach have been 
demonstrated in numerous attitude and opinion surveys made in the 
past several years, as well as in studies of achievement tests. A pleasing 
feature has been the simplicity of the techniques involved. 

Most of the material, with respect to both applications and theoreti- 
cal developments, is as yet unpublished. A manuscript has been pre- 
pared by Edward A. Suchman and the writer which will give the first 
comprehensive statement of both the theory and practice of scale analy- 
sis. This manuscript will form part of the four volumes soon to be pub- 
lished by the Social Science Research Council on the work of the 
Research Branch, Information and Education Division of the War De- 
partment. These volumes will also provide many illustrations of how 
scale analysis has been used for practical problems. Meanwhile, some 
brief statements of the principal concepts and instructions for practical 
procedures are available in article form to those who wish to use this 
approach in their own research (see the bibliography below). 

On the basis of some articles which have been published and of some 
mimeographed progress reports, Festinger (1) has recently attempted a 
survey and evaluation of scale analysis. Since his survey is not based 
on all the information available, it is admittedly tentative and incom- 
plete. In addition, full advantage has not been taken of the material 
which Festinger used as his sources; he raises a number of points which 
have already been answered there, and also introduces erroneous inter- 
pretations and conclusions. 

It seems worthwhile to discuss at the present time some of Festing- 
er’s criticisms in order to help clarify the issues and to correct some 
important misapprehensions. Attention is also called to some articles 
that have appeared since Festinger prepared his paper, discussing vari- 
ous aspects of scale analysis (10, 11, 13). 

Three of Festinger’s points will be analyzed here: (a) criteria for 
scalability, (b) techniques of analysis, and (c) the use of scale analysis 
in practice. In the course of the discussion, some other aspects will be 
brought out which Festinger has not considered. 

Criteria for Scalability 

Reproducibility. The main purpose of scale analysis is to test the 
hypothesis that a universe of qualitative items can be represented by a 
quantitative variable. In order for the universe to be represented ex- 
actly by a quantitative variable, each item must be a perfect function 
of that variable, or be perfectly reproducible from it. Thus the concept 
of reproducibility is paramount in scale analysis. 

In practice, only a sample of items is used from the universe of con- 
tent. Furthermore, in practice, it is not expected to find perfectly re- 
producible or scalable universes. Among other things, perfect repro- 
ducibility implies perfect test-retest reliability, which is certainly not 
to be expected empirically. However, if the reproducibility of the entire 
universe is very high, say over 90%, then that may be sufficient for 
many practical purposes. A quantitative variable which will represent 
an indefinitely large universe of items that well will ordinarily not lose 
much predictive power, whether used for predicting outside variables 
or whether predicted from outside variables. This will especially be true 
if the errors of reproducibility are random. 

Since universe reproducibility must be estimated on the basis of only 
a sample of items, it becomes evident that the sample’s reproducibility 
alone may not be a sufficient guide. Festinger criticizes the sample re- 
producibility coefficient for its inadequacy.* This inadequacy was recog- 
nized at the outset in scale analysis. The same kind of examples that 
Festinger uses (1, pp. 156-157), showing how five or nine statistically 
independent items can have high reproducibility, were worked out previ- 
ously; several such examples will appear in the forthcoming volume. 
Indeed, there is an even worse case than that of statistical independence, 
namely that wherein some items have negative relationships with others; 
this is worse than being statistically independent from the point of view 
of scale analysis. Examples can be constructed showing how even in 
this case it is possible to have spuriously high reproducibility in a small 
sample of items. 

Festinger omits to point out that this problem about reproducibility 
was raised before, and that several answers have already come forth. 
In one of my mimeographed reports to which Festinger refers (15), there 
is the following question and answer: 

Q. Is reproducibility by itself a sufficient test of scalability? 

A. No. It is the principal test, but there are at least three other features 
that should be taken into account: (a) range of marginals, (b) random scatter of 
errors, (c) number of items in the sample. 

* Hausknecht (12) has raised this criticism earlier, also without taking cognizance 
of the fact that other criteria have always been used as discussed below. 

Further questions and answers elaborate on the point. And again, in 
another paper (7) to which Festinger refers, it is stated: 

The percent reproducibility alone is not sufficient to lead to the conclusion 
that the universe of content is scalable. The frequency of responses to each 
separate item must also be taken into account for a very simple reason. Re- 
producibility can be artificially high simply because one category in each item 
has a very high frequency. It can be proved that the reproducibility of an 
item can never be less than the largest frequency of its category, regardless of 
whether the area is scalable or not. 

And further: 

An empirical rule for judging the spuriousness of scale reproducibility has 
been adopted to be the following: No category should have more error in it 
than non-error. 

If this latter rule alone were applied to Festinger's examples, it would 
immediately reject the hypothesis that the items are from scalable uni- 
verses. The consideration about pattern of error would also disqualify 
the hypothesis that the items were from scalable universes. 

An Alternative. One contribution to spuriously high reproducibility 
is the fact that each item is being related to a score which is based in 
part on the item. An alternative way to compute the coefficient of re- 
producibility is to hold out each item in turn from the analysis, thus 
obtaining N sets of trial scale scores. The errors for each item can then 
be counted from its relationship to the score based on the N - 1 other 
items. 

If this partial-score method were used on statistically independent 
items, then the reproducibility for each item would be precisely the 
relative frequency of its modal category. Thus, in Festinger's example 
(1, p. 156) of five independent dichotomies with marginals 80%, 60%, 
50%, 40%, and 20%, the respective modal relative frequencies are 80%, 
60%, 50%, 60%, and 80%; hence, the reproducibility of all five items, 
computed from partial scores, would be the mean of the latter five per- 
centages or 66%, compared with the spurious 86% Festinger obtained 
from whole scores. Indeed — no matter what the interrelations of the five 
items were — their reproducibility could not be less than 66%, because 
reproducibility of an item can never be less than its modal frequency. 
Similarly, in Festinger's second example (1, p. 157) of nine statistically 
independent dichotomies with marginals .9, .8, .7, .6, .5, .4, .3, .2, .1, the 
respective modal proportions of the items are .9, .8, .7, .6, .5, .6, .7, .8, 
and .9, so the reproducibility of the set cannot be less than .72; Festinger 
finds .83 reproducibility from whole scores whereas if part scores were 
used the obtained reproducibility would be .72. 
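
The two figures just quoted (66% and .72) can be checked with a few lines of arithmetic. The sketch below is not part of the original analysis; the function names `modal_frequency` and `partial_score_reproducibility` are assumptions introduced here, and the computation applies only to the case the text describes, namely statistically independent dichotomies, where each item's partial-score reproducibility equals the relative frequency of its modal category.

```python
def modal_frequency(marginal):
    """Relative frequency of the more frequent category of a dichotomy."""
    return max(marginal, 1.0 - marginal)


def partial_score_reproducibility(marginals):
    """Mean modal frequency over all items.

    For statistically independent dichotomies this is the reproducibility
    obtained from partial scores, and it is also the lower bound on
    reproducibility whatever the interrelations of the items.
    """
    modal = [modal_frequency(p) for p in marginals]
    return sum(modal) / len(modal)


# Festinger's two examples, as quoted in the text.
five_items = [0.80, 0.60, 0.50, 0.40, 0.20]
nine_items = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

print(round(partial_score_reproducibility(five_items), 2))  # 0.66
print(round(partial_score_reproducibility(nine_items), 2))  # 0.72
```

These values are to be compared with the whole-score figures of 86% and .83 that Festinger reports for the same two sets of items.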

Items with extreme marginals like .9 and .1 do not help much in 
testing reproducibility since such items can never have more than 10% 
error. 

In practice, it does not usually seem worthwhile to bother with par- 
tial scores, although this technique is available for doubtful cases. The 
fictitious examples of independent items do not illustrate what is to be 
expected in practice. Attitude (or achievement) items of the same gen- 
eral content are usually sufficiently correlated so that scores based on 
eleven of them will not be substantially different from scores based on 
twelve. Reproducibility from whole scores will not be much greater 
than from part scores — so their spurious excess of reproducibility over 
that from part scores can be largely ignored. Furthermore, even part- 
score reproducibility is not a sufficient test of scalability, for the ad- 
ditional criteria mentioned above must also be considered. 

There is room for more improvement on criteria for scalability when 
samples of content are used, but it should be made clear that reproduci- 
bility by itself has not been and is not the sole basis for drawing inferences 
from a sample of items. It is the basic one, because the reproducibility 
of the universe is essentially what is in question, but additional criteria 
have been and are being used. 

Reliability. The suggestion that Festinger makes that the expected 
occurrence of scale responses be calculated under the assumption of a 
perfect scale plus a certain degree of unreliability is a promising one. 
This idea had been thought of in the earlier stages of the development 
of scale analysis but discarded in the form Festinger has suggested. The 
proportion of people with no scale errors cannot be properly calculated 
by the method that Festinger uses. Apparently he assumes that if .9 is 
the proportion of population responses that will be in the scale pattern 
for one question, then the proportion that will be jointly within the 
scale pattern for seven questions is (.9)^7 or 47.8%. Unfortunately, the 
same reasoning would say that the proportion of people who have seven 
responses outside the scale pattern should be (.1)^7; and in general the 
proportion of people with X scale responses and 7 − X scale errors should 
be given by the binomial distribution 

$$\frac{7!}{X!\,(7 - X)!}\,(.9)^{X}\,(.1)^{7 - X}.$$

But this is impossible, for nobody can have all his responses as scale 
errors. Indeed, for the empirical example that Festinger borrowed (1, 
Fig. 2, p. 157), no matter what pattern of response a person may have, 
he can be placed into one of the scale patterns with at most four errors. 
Therefore, the range of possible errors for each person is 0 through 4, 
rather than 0 through 7 as Festinger supposes. This means that Fes- 
tinger’s calculations cannot be carried out consistently to estimate re- 
producibility under the given assumption. The difficulty is that whether 
one of a person's responses will fall into the scale pattern is not 
independent of whether another of his responses is within the scale 
pattern. Unreliability does not behave that way with respect to the 
scale pattern. 
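
The inconsistency can be made concrete by tabulating the binomial distribution that this reasoning implies. The sketch below is in Python and is offered only as an illustration; the helper name `binomial_pmf` is an assumption of this example, and the figures .9 and seven items are those attributed to Festinger in the text. It shows that the binomial hypothesis assigns a positive proportion to five, six, and seven errors, although at most four errors are possible in the empirical example.

```python
from math import comb

n_items = 7          # number of questions in the borrowed example
p_in_pattern = 0.9   # assumed proportion of responses within the scale pattern
max_possible_errors = 4  # stated in the text for this example


def binomial_pmf(x, n, p):
    """Binomial proportion of people with exactly x scale responses out of n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Proportion with no errors at all: (.9)^7, the 47.8% quoted in the text.
no_errors = binomial_pmf(n_items, n_items, p_in_pattern)
print(round(no_errors, 3))  # prints 0.478

# Proportion the binomial assigns to error counts that cannot occur.
impossible_mass = sum(
    binomial_pmf(n_items - errors, n_items, p_in_pattern)
    for errors in range(max_possible_errors + 1, n_items + 1)
)
print(impossible_mass > 0)  # prints True
```

The impossible proportion is small in this instance, but its being greater than zero is enough to show that the binomial hypothesis and the scale pattern cannot both hold.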

The actual reproducibility of this example of seven questions was 
about .85 rather than the .9 Festinger assumed. It is interesting to note 
that (.85)^7 is .32, which is not far from the “over one-fourth” perfect 
scale types reported. Actually, the universe sampled by these seven 
questions would not now be accepted as sufficiently scalable but would 
be broken up into sub-universes; the study was made when 85% re- 
producibility was the empirical rule rather than the present 90%. The 
study did serve its purpose well, however, as collateral evidence pre- 
sented there showed. 

The further calculation that Festinger makes of adding 3.7% to 
his 47.8% seems based on an unfortunate double usage of the word 
“chance.” In his second paragraph on p. 158 (1), “chance” is used to 
mean statistical independence between items. Such independence can- 
not exist simultaneously with the assumption of a scale pattern in his 
following paragraph; that is, the 7% who fall into perfect types under 
the hypothesis of independence of items have nothing to do with the 
distribution of error under the assumption of uni-dimensionality plus 
unreliability. The binomial distribution by itself, if it were correct, 
takes care of the second situation. Hence, Festinger's calculations are 
incompatible in adding 3.7% (7% of 1 − .478) to 47.8% to obtain 52.2% 
as the “chance” proportion. The 7% is correct for independent items; 
the 47.8% would be correct for the scale-plus-unreliability case if the 
binomial hypothesis held; and the two cases do not hold simultaneously. 
“Chance” means something different in each case. 

A consistent use of reliability. Several correct approaches to the 
use of the concept of unreliability are possible, instead of the incon- 
sistent binomial approach. One such approach will be sketched here 
briefly for the case of dichotomies. Let n be the number of dichotomies 
in the sample of items, so that there are n + 1 scale types or ranks pos- 
sible. Let r be the rank of the type that is “positive” on r of the items; 
r ranges from 0 to n. Let P_r be the proportion of the population whose 
“true” rank on the n items is r, and let P_rj be the probability a person 
of “true” rank r will be “positive” in the jth item (j = 1, 2, . . . , n). 
There are 2^n types of people, scale and non-scale, possible on the n 
dichotomies. The expected proportion in each of the 2^n types can be 
calculated from the P_r and P_rj. Conversely, from the observed 2^n pro- 
portions in an actual experiment, the P_r and P_rj can be estimated. There 
are n + 1 parameters P_r, of which n are independent since their sum 
must be unity. There are (n + 1)n parameters P_rj, all of which are inde- 
pendent. Hence, there are n + (n + 1)n, or n(n + 2), independent param- 
eters to be estimated from 2^n − 1 independent observations. If n is 
greater than 5, this provides more equations than there are unknowns, 
so the hypothesis of the scale structure can be tested, as well as having the 
parameters estimated. Unfortunately, the equations involved in the 
above analysis are curvilinear, and do not seem to lend themselves to 
practical use because of the difficulties in the numerical computations. 
Furthermore, even this analysis has been simplified by assuming that 
persons within the same “true” rank were equally reliable within each 
item. Without this simplifying assumption, the equations would have 
innumerably more parameters. 
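
The count of parameters against observations is easily verified. The following Python sketch is illustrative only, with function names chosen for this example; it tabulates the n(n + 2) independent parameters against the 2^n − 1 independent observed proportions and confirms that the observations first outnumber the parameters when n exceeds 5.

```python
def independent_parameters(n):
    """Independent parameters for n dichotomies under the model sketched above."""
    rank_proportions = n               # n + 1 proportions P_r, which must sum to unity
    item_probabilities = (n + 1) * n   # one P_rj for each rank r and each item j
    return rank_proportions + item_probabilities  # equals n * (n + 2)


def independent_observations(n):
    """Independent observed proportions among the 2^n response types."""
    return 2**n - 1


for n in range(2, 9):
    params = independent_parameters(n)
    obs = independent_observations(n)
    print(n, params, obs, obs > params)

# Smallest number of items for which the hypothesis becomes testable.
smallest_testable_n = next(
    n for n in range(1, 50)
    if independent_observations(n) > independent_parameters(n)
)
print(smallest_testable_n)  # prints 6
```

With five items there are 35 parameters against 31 observations; with six there are 48 against 63.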

In any analysis using the concept of test-retest reliability, it must 
be remembered that scalable data must in general be highly reliable, al- 
though the converse is not necessarily true. The coefficient of reproduci- 
bility, especially if computed by the part-score technique described 
above, sets a lower bound to the average reliabilities of the items (6, 
and especially 8). In particular, if items are perfectly reproducible, they 
are perfectly reliable. Hence, Festinger errs in his assertion that “Even 
if a perfect scale were achieved these claims [concerning invariance 
properties] would all be limited by the degree of reliability . . . of the 
questions asked” (1, p. 160). Perfectly scalable data are perforce per- 
fectly reliable. Conversely, highly unreliable data cannot be scalable. 
One of the contributions of a scale analysis is to provide automatically 
information about reliability by helping set a lower bound to it for each 
item. 

The simple criteria used in conjunction with that of reproducibility 
for sample data do serve to distinguish between data that are highly 
scalable and those that are not. The case where the items are inde- 
pendent will always be rejected on the basis merely of the criterion of 
improvement, namely, that no category should have more errors than 
non-errors. The further criteria of studying patterns of error also tend 
to insure that no dominant second variable is present even if reproduci- 
bility is high. That is what is meant by the statement that “in imperfect 
scales, scale analysis picks out deviants or non-scale types for case 
studies.” If no non-scale types have substantial frequencies, then that 
tends to indicate that there is no substantial second factor present. 
However, if one or more non-scale types do have a substantial frequency, 
then that is an indication of where an additional factor (or factors) is 
entering into the picture. If an additional factor is sufficiently promi- 
nent, it may be worthwhile to try to piece it out further by asking ad- 
ditional questions. The universe might be divided into two or more sub- 
universes, each of which may be scalable separately. Or it may turn 
out that the additional factor is so highly correlated with the most domi- 
nant factor that it does not make much difference whether they are 
treated as two separate variables or as a single variable. 

The problem is not to find out whether a perfect scale is present in 
practice, but rather whether it is worth worrying about any additional 
variables that may be present. The criteria used in practice are believed 
to provide an answer to this and to decide properly whether or not a set 
of data can be regarded as sufficiently scalable for most practical pur- 
poses. 

Quasi-scales. One kind of non-scalable universe is called a quasi-scale. 
A quasi-scale is different from a scale, not just in the reproducibility, 
but in the entire pattern of responses. Festinger seems to have misunder- 
stood the definition of a quasi-scale, for he seems to believe that it dif- 
fers from a scale only with respect to reproducibility (1, p. 156 and p. 
159). A universe which is quasi-scalable will ordinarily have less than 
85% or 90% reproducibility, but that is not its distinguishing feature. The 
distinguishing feature is the gradient in the responses to the items. Cutting 
points cannot be established (as in the case of a scale) which will enable 
one to say that a person above the point is in one category of an item 
and a person below the point is in another category; but one can state 
that, if one person is higher than another in the quasi-scale, then his 
probability of being in a higher category of an item is correspondingly 
greater. 

There are many kinds of configuration which are less than 85% or 
90% reproducible and which are not quasi-scales at all. For example, 
an area may have two or more dominant factors in it, in which case it 
would not be either a scale or a quasi-scale. In a quasi-scale, there are 
one dominant factor and infinitely many small factors. The order of 
people in a quasi-scale is according to the dominant factor, and is es- 
sentially invariant from sample of items to sample of items, provided 
that the samples are large enough. There is a great deal of work yet to 
be done on the theory of quasi-scales, but enough is known to say that 
they have quite a different character from scales and from other kinds 
of universes. Another distinguishing feature between a scale and a 
quasi-scale is that the scale has an intensity function and further mean- 
ingful components, whereas a quasi-scale does not have an intensity 
function or further components of that kind. 

Neurotic phenomena have been found to be quasi-scalable. For ex- 
ample, the Neuro-Psychiatric Screening Adjunct, which is the official 
paper and pencil test used at all military stations since October, 1944, 
is a quasi-scale and is a product of a rigorous investigation of efficient 
screening tests made possible by the scale analysis approach (16, 17). 

Techniques for Scale Analysis 

Scalogram devices. There are several alternative schemes now avail- 
able by which to carry out a scale analysis in practice. They are virtu- 
ally equivalent in terms of the results they yield, but they differ some- 
what in operation. Scalogram boards have been the principal device 
used by the War Department, and are perhaps the most flexible and 
easiest to use. The boards are relatively simple to make and to operate; 
the cost depends upon how large a board is desired and whether or not 
a pair is to be made. If a single board is used instead of two, then the 
workmanship need not be precise and the board can be made fairly 
cheaply by any carpenter. There are alternative mechanical schemes 
that might be used instead of the wooden board, and undoubtedly 
other schemes will be invented in the future which will be even easier 
to construct. Instructions for the construction and use of a scalogram 
board will appear in the forthcoming volumes on the work of the Re- 
search Branch. 

The Cornell technique (7) is also very easy to learn; it is taught in 
a course on attitude and public opinion analysis to students who have 
no background whatsoever in statistics. For achievement tests, where 
all items are dichotomous — being marked either right or wrong — the 
Cornell technique is perhaps the best of all to be used. For dichotomies, 
there is no problem of combination of categories, so that there is but one 
trial to be made in an analysis. The Cornell technique suffers a bit in 
flexibility compared to the scalogram board when a series of trials has 
to be made. Ordinarily, but two trials may be needed in an analysis, 
and the Cornell technique has proved very advantageous in such cases 
for general research purposes. It can be carried out on IBM equipment 
as well as by hand. 

The Goodenough technique (2) is based upon an explicit tabulation 
of all combinations of responses that actually occur. It is more “rigor- 
ous” than the preceding two techniques in that it counts the errors at 
each stage. However, it yields no different results in the end. Appar- 
ently Festinger has not worked through the Goodenough technique to 
see how it does work out in practice.* The first step seems simple, but 
it takes a good deal of experience to master the three following steps. 
The process becomes very bulky and involved when ten or twelve items 
are used. 

The Cornell Technique has the advantage that its complexity does 
not at all change, regardless of the number of items (though of course 
the amount of labor increases with the number of items). The same 
lack of increase of complexity holds to a slightly less degree with the 
scalogram board. 

The problem of metric. The earliest technique for scale analysis was 
that of least squares (3). It is quite properly to be abandoned as a 
procedure in practice because it is certainly far more cumbersome than 
the others. However, the equations involved have turned out to be of 
basic importance in interpreting a scale, and have led in particular to the 
empirical treatment of the intensity function which is proving so vital 
for attitude and public opinion work. Also, the basic thinking behind 
the equations has led to a solution to the related problem of paired- 
comparisons (12). 

In the beginning of my work on scale analysis, I had thought that 
one of the most important problems was that of metric. I had thought 
that how to obtain weights for items was perhaps the leading problem 
to be solved. But as the theory of scale analysis developed, it became 
clear that the problem of weights was essentially a minor one for most 
practical purposes. Indeed, for the perfect scale pattern, it is easy to see 
that if scores are to be obtained for people by adding up weights assigned 
to categories of items, then, no matter what weights are used (as long 
as they have the proper rank order within each item), the scores of the 
people will have exactly the same rank order. The ordering of people in 
this sense does not depend at all upon finding a particular weighting 
system. 
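
The invariance of rank order under different weightings can be illustrated for the case of dichotomies. The Python sketch below is an illustration under stated assumptions and not a procedure taken from the original work: it builds the n + 1 perfect scale types for five hypothetical items, draws many arbitrary weightings in which the “positive” category of each item outweighs the “negative” one, and checks that the summed scores always rank the types in the same order. The function names and the range of the random weights are choices made for this example.

```python
import random

N_ITEMS = 5  # hypothetical number of dichotomous items


def perfect_pattern(rank, n):
    """Scale type of a given rank: positive (1) on the first `rank` items, else 0."""
    return [1 if j < rank else 0 for j in range(n)]


def score(pattern, weights):
    """Sum the category weights; weights[j] = (negative weight, positive weight)."""
    return sum(weights[j][response] for j, response in enumerate(pattern))


def random_weights(n, rng):
    """Arbitrary weights that keep the proper rank order within each item."""
    weights = []
    for _ in range(n):
        negative = rng.uniform(-10.0, 10.0)
        positive = negative + rng.uniform(0.1, 10.0)  # positive always exceeds negative
        weights.append((negative, positive))
    return weights


def rank_order_preserved(weights, n):
    """True if scores increase strictly with the rank of the scale type."""
    scores = [score(perfect_pattern(r, n), weights) for r in range(n + 1)]
    return all(a < b for a, b in zip(scores, scores[1:]))


rng = random.Random(0)  # fixed seed so the check is repeatable
always_preserved = all(
    rank_order_preserved(random_weights(N_ITEMS, rng), N_ITEMS)
    for _ in range(1000)
)
print(always_preserved)  # prints True
```

The reason is visible in the code: moving from one scale type to the next changes exactly one response from negative to positive, so the score must rise whatever the particular weights.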

The important problem turned out to be that of finding the structure 
a universe of items must have in order to be scalable; it was not that of 
finding weights. 

* Festinger also apparently has misread Goodenough as to how to measure reproduc- 
ibility. Goodenough explicitly says that “at least 85% of the total number of responses 
must fall within the scale pattern, so that it is possible to reproduce 85% correctly all the 
responses of all the respondents from the scale scores” (2, p. 184). Festinger seems to 
have misread this to mean that 85% of the individuals fall into perfect scale types. 

The problem of a metric does turn out to come into the picture for 
further problems, and it first appeared as a practical problem with re- 
spect to that of bias in questionnaire wording (11, 13, 14). The problem 
here was, after people are ranked from a high to a low on an attitude 
or opinion, to find a dividing point in the order such that the people on 
one side can be called positive and people on the other side can be called 
negative. The equations of scale analysis, when applied to the perfect 
scale pattern, show a most remarkable result. They show that a universe 
of items which is perfectly scalable can be resolved into an infinite series 
of principal components, the first of which provides the basic metric, 
the second of which is the intensity component, and the remaining ones 
are as yet not named (10). Empirical study of the intensity function 
has afforded for the first time a scientific solution to the problem of 
question bias. 

These equations, then, show that a scalable attitude is somewhat 
different from the twelve-inch ruler that Festinger uses as an analogy 
(1, p. 160). The responses of a person to items in a scalable universe are 
seen by means of these equations to be a function of the person’s metric 
score, his intensity, and the further components in the scale. The per- 
son’s rank order is sufficient to reproduce his responses exactly; in this 
sense, the responses of the population are but a function of a single 
variable. Resolving the responses into components by the alternative 
device of the least squares equations shows the responses to be a function 
of infinitely many variables, each of which is a function of the rank 
order. 

These striking results from using the least squares equations in con- 
junction with the perfect scale pattern will be elaborated on in the forth- 
coming publication on the work of the Research Branch. It might 
further be pointed out here that these equations resolve also the paradox 
which appears in achievement tests where the difficulty of an item seems 
to introduce a factor different from the common content factor that the 
items may have. Since scale analysis applies to achievement tests as 
well as to attitude or opinion areas, achievement tests also are resolvable 
into the principal components of a scale. In a scalable achievement 
test, then, each item is a function of but a single dimension from the 
point of view of reproducibility, but a function of infinitely many di- 
mensions from the point of view of principal components. The apparent 
contradiction between these two points of view is resolved by the fact 
that the infinitely many principal components in turn are perfect func- 
tions of the rank order of people. 

Uses of Scale Analysis 

Incidence of scales. The theory and techniques of scale analysis pro- 
vide a test of the hypothesis that a universe of qualitative items can be 
represented by a single quantitative variable. This hypothesis is ap- 
propriate for any qualitative universe obtained by any method of ob- 
servation. The universe may be a set of items recorded on a question- 
naire, or observations obtained in non-directive interviews, by partici- 
pant observation, or by any other technique of gathering data. No 
matter how the data are gathered, each observation is but a sample of 
all similar observations that could have been obtained, and the entire 
universe of observations is ordinarily of interest. 

As Festinger suggests, scalable universes may be the exception rather 
than the rule. Festinger does not give any explicit reasons for his belief, 
but this position will be substantiated in the forthcoming volume. It 
has already been pointed out that one possible reason for the existence 
of an attitude scale is that of a homogeneous culture (4, p. 149). If a 
population is not subjected to the same social stimuli with respect to the 
attitude, it might be expected that it will prove to be unscalable for 
them. The fact that neurotic phenomena have not been found scalable 
can perhaps be explained in this fashion. Similarly, an area of achieve- 
ment may be expected not to be scalable if there is no uniform program 
of training for the population involved. 

Another reason for expecting many universes not to be scalable in 
practice is that the notion of a universe is so comprehensive. Each 
sub-universe of a universe is of course itself a universe. Since there is 
ordinarily a vast number of imperfectly related sub-universes, there 
must be a vast number of combinations of them which are non-scalable 
universes. Merely this formal consideration would lead one to believe 
that most universes are not scalable. Non-scalable universes may of 
course be broken down in some cases into scalable sub-universes. One 
of the contributions of scale analysis is to point out the need for being 
clear about the universe’s content. By focusing on more and more 
homogeneous content, research can be made more meaningful and ex- 
ternal predictions be made more effective in the long run. 

The development of the above-mentioned screening test for psycho- 
neurotics (16, 17) is but one example of how research utilizing scale 
analysis was more effective than it would have been had the more tra- 
ditional but less incisive procedures been followed. Instead of throwing 
together all kinds of conceivable predictive items into one composite, 
fifteen different universes of content were defined which might be re- 
lated to the criterion of psychoneuroticism. The structure of each of 
these universes was first analyzed separately. Because each was found 
to be either a scale or a quasi-scale, only a relatively few items from 
each were needed in order fully to utilize the predictive power of the 
universes. The multiple correlation of the criterion was then worked 
out on all fifteen predictors with the finding that one of the universes 
predicted as well as the best combination of the fifteen. This enabled 
the short but efficient screening test to be used with the knowledge that 
it retained the predictive power of innumerably many items in fifteen 
different universes. Such a complete usage of predictive power could 
not have been made without scale analysis. 

From the practical point of view, another important feature here is 
the amount of labor saved by scalogram techniques in obtaining this 
maximum predictive power, compared to using more traditional tech- 
niques which are far more laborious and which would yield less effective 
predictions. 

The two problems, that of scalability and that of external prediction, 
are distinct but related. By focusing on the scaling problem in its own 
right, more effective external predictions are thereby made possible. 

There are many areas which have been found to be scalable thus far, 
and therefore these areas can be handled economically by means of 
simple scale scores. Many areas have also been found not to be scala- 
ble; all such areas cannot be handled so simply. It is known how to treat 
quasi-scalable areas, and Lazarsfeld is now completing a theory of the 
latent dichotomy which also can be handled by means of a single quanti- 
fication. How to utilize other kinds of non-scalable areas is still an 
unsolved problem. The emphasis that scale analysis makes in this con- 
nection is that unless the structure of the universe is known, it is not 
known how best to treat the universe for any particular purpose. 

Distinction between theory and techniques. The basic theory of 
scale analysis is not to be confused with particular techniques for carry- 
ing out such an analysis in various kinds of situations. Festinger borders 
on confusing the two when he states that “‘scale analysis’ seems to be an 
excellent technique for use with paper and pencil tests or other instances 
of measurement where the situation permits the inclusion of several 
questions centering about the same topic” (1, p. 160). If a research 
problem is concerned with a universe of content, then that universe 
must be studied. That is what the theory calls for. How adequate is the 
technique which Festinger implicitly advocates of studying only a single 
item from the universe? 

One of the important aspects of a universe of content is its structure; 
for example, is the universe scalable or does it have some other kind of 
structure? The theory of scale analysis tells what a scalable structure is, 
and the various properties possessed by such a structure. 

The practical problem is to obtain information about the structure 
from only a sample of items. It has already been indicated how an ade- 
quate sample of items can be chosen to test the hypothesis of scalability. 
Furthermore, the number of items to be used in a pretest must be dis- 
tinguished from the number of items to be used in a final study. One of 
the properties of a scalable universe is that only one or two items can 
be used in a final study for many purposes once their place in the uni- 
verse is ascertained. The scalability of the universe must first be ana- 
lyzed, however, by a dozen or so items in a pretest. 

The statement that “most of those engaged in this type of research 
[public opinion] will probably find the inclusion of a series of questions 
which could be subjected to scale analysis not feasible from practical 
considerations” (1, p. 159) does not accord with what is the actual prac- 
tice both in public opinion and in market research, as well as in general 
attitude research. It is because workers in these fields are concerned 
with a universe of content that they pretest various questions on the 
same topic; it is a foolhardy pollster who bases conclusions on but a 
single question. The use of the split-ballot is evidence of this concern 
with sampling of content. In addition, ordinary polls often include 
several questions on the same topic on the same ballot. The extreme 
position taken by advocates of “open-ended interviewing” is to ask a 
whole series of questions of every respondent. And of course, conven- 
tional attitude surveys almost invariably use a substantial set of ques- 
tions for a given topic. 

It is a misapprehension to believe that asking several questions 
on the same topic necessarily creates a problem of rapport. In one sur- 
vey made of a national cross-section by a leading public opinion polling 
agency, an area of content was defined and then sampled by four ques- 
tions. Some of the interviewers complained because of the great simi- 
larity of wording of questions. The questions were very similarly worded 
because the content concerned the size of the Navy and was very hard 
to discuss in different ways. But even under these adverse circumstan- 
ces, the analysis was successful in showing that the area was scalable and 
that the zero point could be located properly by the intensity function. 
Even more questions in the same area had been used in the pretest in 
Ithaca on a cross-section of the population there, and interestingly 
enough there was no complaint either from the respondents or from the 
interviewers, although the interviewers were no different from those 
used in the national cross-section and had no knowledge whatsoever of 
what was involved in scale analysis. An area of apparently very similar 
questions is an exception rather than the rule. The example about 
desire for post-war schooling that Festinger has borrowed (1, p. 157) 
certainly provides no problem of rapport, and the general run of areas 
studied by public opinion polls do not present any particular problem of 
rapport. Another large market research agency has tried scale analysis 
in a routine study and has found no difficulty whatsoever with it. Be- 
cause of its simplicity and its objective solution to the problem of bias, 
this agency plans to use this approach regularly. 

It seems premature, then, to conclude that scale analysis cannot be 
carried out in practice in public opinion work. To the contrary, scale 
analysis is becoming more essential in this field because it affords for the 
first time a scientific solution to the basic problem of bias in public opin- 
ion polls. This problem arises from the fact that a universe of content is 
being studied and any single question is but a sample of all possible 
questions that could have been asked. How can one determine which 
question does coincide with the zero point of the entire universe, that is, 
the point which divides those who are negative on the issue from those 
who are positive? 

The intensity function provides a scientific solution to this problem 
(13). It provides both a definition and a technique for ascertaining a 
zero point for the population. Unless some such objective approach to 
the question of bias is used in public opinion polls, it cannot be certain 
how much credence to place on their reports. 

By providing a solution to the problem of bias, scale analysis clears 
the way for asking questions in the manner which will best help estab- 
lish rapport with the respondent. The particular form of a question 
does not affect the results of scale analysis, so the research worker can 
concentrate on obtaining the wording which will make the interviewing 
work go most smoothly. Thus scale analysis has a contribution to make 
toward increasing rapport in surveys rather than the contrary. Appre- 
hension that the opposite is true seems to be due to a misconception that 
scale analysis presupposes a particular way of asking questions. 

If progress is to be made in the scientific study of attitudes, public 
opinion, and achievement, it seems necessary to concentrate on the 
problem of the structure of content. Techniques are not worth much if 
not guided by any theory. The theory of scale analysis happens to lead 
to simple and practical techniques. To compare these techniques with 
others, one would have to ask: what theory of structure guides the al- 
ternative techniques and how adequately is this theory served thereby? 

NOTE ON “A REVIEW OF LEADERSHIP STUDIES WITH 
PARTICULAR REFERENCE TO MILITARY PROBLEMS”^ 

DONALD E. BAIER 
Personnel Research Section, A.G.O. 

The valuable report* with which this note is concerned “. . . sum- 
marizes and reviews selected references from the available literature 
dealing with the problem of the selection of leaders in various fields. 
The primary interest in preparing the article was to provide a summary 
of techniques and results that would be of value to psychologists dealing 
with problems of selecting leaders, particularly in the military field.” 

It is the purpose of this note to make available additional facts and 
comments which appear to bear on the following conclusions of the 
reviewer: 

1. “Progress has not been made in the development of criteria of leadership 
behavior . . . .” 

2. “Advances in methodology in this field are definitely not striking.” 

It is this writer’s belief that these conclusions, insofar as they are meant 
to apply to military leadership, are not entirely warranted. 

In two reports* published by the Medical Field Research Labora- 
tory, Camp Le Jeune, N. C., research on measurement of “leadership” 
is reported. These studies indicate a substantial relationship (tetra- 
choric r = .42) between superior officers’ reports of the combat perform- 
ance of Marine Corps officers graduated from the Corps Officer Can- 
didate School and the standing of these graduates among their fellow- 
marines as indicated by a nomination procedure conducted during their 
pre-officer training. The two sets of evaluations were completely inde- 
pendent. 

An as yet unpublished follow-up study by the Personnel Research 
Section, AGO, of West Point graduates after 18 months of duty as 
Army officers also reveals a significant association (r = .51 for Infantry 
Officers) between inter-cadet ratings or leader-nominations and success 
as an officer measured by the Officer Efficiency Report, WD, AGO Form 
67. Here again there is basis for believing that the two measures are 
independent. 

^ The opinions expressed herein are those of the author and do not necessarily rep- 
resent the official view of the War Department. 

* Jenkins, William O. A review of leadership studies with particular reference to 
military problems. Psychol. Bull., 1947, 44, 54-79.

• Validation of officer selection tests by means of combat proficiency ratings. Medical
Field Research Laboratory Report No. 1, January 18, 1946 and No. 2, May 16, 1946.
Camp Le Jeune, N. C.

The reviewer’s account of the research upon which are based the current
methods for selecting wartime officers for integration into the regular
Army may result in misunderstanding. In discussing the correlation
between the Officer Evaluation Report and the criterion of leadership, 
the latter being a product of nominations by subordinates and peers 
with a veto power resting with the commanding officer of the group, 
Jenkins states: 

. . . The degree to which the Commanding Officers’ ratings were weighted
in the Officer Evaluation Report was not stated, but it appears likely that this 
factor played an important role. Substantial agreement between ratings by 
the C.O. and by fellow officers was to be expected. Since the OER had the 
highest validity, and the other measures when combined with it increased its 
correlation with the criterion only .07, these questions suggest the necessity for 
a further examination of the nature of the criterion here employed (p. 74). 

The Officer Evaluation Report was accomplished in the majority of 
cases by the immediate supervisor, not the C.O., and represented only 
the former’s evaluation of the ratee. The conclusion that substantial 
agreement between ratings by the C.O. and by fellow officers was to be 
expected does not appear to be justified. The C.O. was only one of from 
7 to 30 nominators who participated in determining the ratee’s criterion 
standing. He had no knowledge of how the other members of the 
nomination group evaluated each ratee, and his rating was used only to 
eliminate from the criterion groups of High, Low or Middle those rare 
cases where the C.O. placed the rated officer in the opposite extreme 
from the combined ratings of his subordinates and peers. Later studies 
employing a criterion which did not include the C.O. showed no drop in 
the validity of the OER or FCL type rating device. It is our belief that 
the nomination criterion employed in the studies cited does represent 
progress in the development of leadership criteria. 

With respect to methodology, the forced choice technique as exem- 
plified in the triads and tetrads of the OER and the recently revised 
Army Officer Efficiency Report seems to deserve more attention than 
the reviewer accords it. This technique, which has been described briefly
in a paper titled “The Forced Choice Technique and Rating Scales,”
presented at the American Psychological Association meeting in Philadelphia
on 5 Sept. 1946 by the Personnel Research Section, AGO, not
only provides valid indicators of the ratees’ standing on a nomination
criterion, but favorably influences ratings of overall competence (if they
are made immediately following completion of the FCL items) so that
they show substantially less negative skewness. Clearly, the forced- 
choice technique is effective in diminishing rater-bias and in improving 
the distribution and validity of ratings which are generally regarded as 
indicative of leadership performance.

Munn, Norman L. Psychology: The fundamentals of human adjustment.
Boston: Houghton Mifflin, 1946. Pp. xviii+497.

The importance of the introductory course in psychology cannot be 
overestimated for it determines to a great extent the student’s attitude 
toward the subject and whether or not he goes any further with it. But 
the importance of the textbook depends to a large degree upon the in- 
structor. Some instructors lean heavily upon the text, others hardly 
at all. In reviewing a book, however, evaluation must be made as if it 
were the sole source of the student’s introduction to the subject, regardless
of the instructor’s predilections, interpretations, choice of material,
or method of handling the course. Though there are suggested
readings at the end of each chapter in this as in other texts which the 
student is urged to consult, their influence is admittedly minor since the 
author of a text, as Munn says, writes with the feeling that he, in com- 
mon with most teachers of the subject, could “organize its topics in a 
more logical sequence, choose apter illustrations, find more interesting 
examples and . . . write a book that . . . would be more appealing to 
instructors and students than any he has seen” (p. ii). Some other requirements
of a good introductory text are succinctly suggested by
President Leonard Carmichael in his Introduction wherein he discusses 
the reasons for studying Psychology today: as an essential part of a 
general education; as preparation for professions like law, medicine, 
teaching, the ministry, and business; and for further professional work 
in the subject itself. That Munn has met both his own and the editor’s 
demands with considerable success there can be no question. The book 
is plentifully provided with excellent diagrams, half-tones, and tabular
matter; it is full of concrete material chosen from a wide variety of
sources; and its approach is scientific throughout.

It would seem, in the light of these virtues, that this text meets the 
requirements of the introductory course almost to perfection. It is such 
a splendid job in so many respects that any criticism at all seems supererogatory,
if not hypercritical; and yet there are qualities expected in an
elementary text which are of equal or greater importance than the ones
which this book has in such large measure. The most important lack is 
an underlying point of view or theoretical structure integrating and 
unifying the topics and their relations to each other and the subject as a 
whole. We shall find that in this respect the book is not up to its ac- 
complishments otherwise. 

The book is divided into seven main parts and these in turn into two 
or more chapters. It begins with a discussion of general, methodological 
and historical material, then proceeds to consider in turn the anatomical 
and physiological bases of behavior, learning, remembering, thinking, 
motivation, conflict, feeling, perceiving, the special senses, statistics, intelligence,
aptitudes, and personality. The treatment develops by
consideration of simpler processes followed by the more complex, in so 
far as possible, though there is some back-tracking which is done with a 
minimum of repetition of earlier material. With the general plan of the 
work before us we can now consider it more in detail. 

Beginning with the origin and scope of psychology, the first two 
chapters are devoted to a brief glance at the history of the subject 
through a consideration of such topics as the psyche, the organism, 
methods in philosophy, physiology, and physics, analysis of consciousness
and some fields of psychology. Chapters 1 and 2 really constitute
a single topic or set of topics and furnish an excellent survey of methods, 
fields, and problems. They are properly brief, to the point, and very 
readable. Only in one detail does the text here need emendation. In
discussing scientific controls it is stated that “there is never more than 
one independent variable in a given experiment .... If two or more fac- 
tors were varied, he (the scientist) obviously would not know which had 
produced the phenomena observed” (p. 23). While it is not expected 
that the logic of analysis of variance and designed experiments should
be presented at the elementary level (though it is not impossible) ad- 
vances in statistics have made the older Mill-Bacon canons of scientific 
procedure represented by this statement quite out-of-date. Variation of 
only a single variable in psychological experiments is possible so seldom 
as to be almost a fiction and now that we have the statistical tools for 
handling multiple variates we might as well give up the fiction. 
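
The point about multiple variates can be made concrete with a small sketch. It assumes a hypothetical 2 x 2 factorial design in which two factors, here called A and B, are varied together; the cell means are invented for illustration and come from no study cited in this review. Averaging over the other factor separates the contribution of each.

```python
# Hypothetical 2 x 2 factorial design: two independent variables varied together.
# The factor names and cell means are invented for illustration only.
cell_means = {
    ("low_A", "low_B"): 10.0,
    ("low_A", "high_B"): 14.0,
    ("high_A", "low_B"): 16.0,
    ("high_A", "high_B"): 20.0,
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def marginal_mean(factor_index, level):
    """Average the cell means over every level of the other factor."""
    return mean(v for k, v in cell_means.items() if k[factor_index] == level)


# Main effect of a factor: difference between its marginal means.
main_effect_a = marginal_mean(0, "high_A") - marginal_mean(0, "low_A")
main_effect_b = marginal_mean(1, "high_B") - marginal_mean(1, "low_B")

# Interaction: does the effect of B differ between the two levels of A?
interaction = (
    (cell_means[("high_A", "high_B")] - cell_means[("high_A", "low_B")])
    - (cell_means[("low_A", "high_B")] - cell_means[("low_A", "low_B")])
)

print(main_effect_a, main_effect_b, interaction)
```

With these invented figures the effect of each factor is recovered separately even though both were varied in the same experiment, and the interaction term is zero. A full analysis of variance would also test these effects against error variance, which this sketch omits.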

Part 2 deals with psychological development and consists of three 
chapters: origin and psychological significance of response systems, con- 
ception to maturity, and factors in psychological growth. Here the 
biological bases of behavior are explained and the psychological processes
most directly correlated with them are brought in. The result is
that the simpler and more complex processes are more or less intermingled
in these chapters as a partial list of the topics reveals: structure
and functions of receptor and nervous systems, embryonic development, 
sensitivity, locomotion, prehension, language, gestures, writing, genes, 
heredity, environment, and maturation. The order is in general from 
simple to complex but there are some reversals. Thus one would expect
a discussion of genes and embryological development before discussion 
of the nervous system but here it follows the latter. The reason for 
Munn’s order is obvious to a reader of the book and a good one: dis- 
cussion of the more elementary gene units links directly with problems 
of heredity, environment, maturation and growth. There is no discus- 
sion of nerve action potentials and the treatment of the autonomic 
nervous system is postponed to the chapter on emotion where diagrams 
illustrating its relations to the cerebro-spinal system are given. The
reviewer finds it impossible to omit the autonomic nervous system when 
explaining the rest of the nervous system, although, like Munn, he finds
greatest use for it in the discussion of feelings and emotions. Figure 13 
in chapter 3, showing the spinal reflex-arc system, would not be over- 
complicated if the sympathetic ganglion and its fibers were included as 
is usually done, and with some textual discussion this would remedy a 
serious omission at this point. On the other hand, Munn has included 
more material on the nervous system than is generally presented. The 
diagrams showing different types of synaptic connections make inter- 
action, facilitation, and inhibition intelligible neurologically. The dis- 
cussion of cortical representation of sensory functions is especially well 
done. 

The chapters on conditioning, learning, memory and thinking suc- 
ceed in presenting a considerable amount of material but suffer from the 
lack of clear integrating principles. While Munn rejects classical,
Pavlovian conditioning theory as an adequate account of all learned 
behavior it is not clear what principles he would employ instead. That 
the author places greatest reliance on trial and error, past experience, 
and association appears from his treatment of certain particular prob- 
lems rather than from explicit structuration of the material. One must 
dig his underlying approach out from a few critical cases which reveal 
the author’s fundamental position. Thus the explanation of how the 
chimpanzee reaches an apparently inaccessible object is a case in point. 
According to Munn we are to suppose “. . . a chimpanzee has, in the
jungle, learned to reach otherwise inaccessible objects by swinging
toward them on a vine. Now in the psychological laboratory, he is
confronted with an apparently inaccessible banana. A rope, however,
is hanging nearby. If the animal sees the similarity between the rope and
the vine, or between his jungle method and the one now possible, he may
solve the problem immediately” (p. 122). (Italics are the reviewer’s.) Now
this explanation of the ape’s accomplishment in terms of past experience 
which at first sight seems to be the scientifically simplest explanation 
actually turns out on closer scrutiny to demand much more in the way 
of memory and intellectual ability than the proposition that the animal 
simply sees the relevance of the rope to the banana which is immediately 
given. This assumption should not be too difficult to make since it was 
pointed out earlier that the difference between classical (mechanical)
and instrumental conditioning lies in the fact that conditioning oc- 
curs much more easily when in the direction of more relevant re- 
sponses. Certainly if the principle of relevance is basic to conditioning 
it may be accepted for the much more complicated case of insightful 
behavior. The welter of factors involved in acquiring skill and learning
could be better ordered and made more meaningfully connected if some
structure were seen behind the facts in question.

The lack of an adequate theoretical framework plagues the reader 
most in the concluding chapter on thinking. Reasoning, we are told, is 
implicit trial and error, it is a form of controlled association, it is a combining
of past experiences in order to solve problems which cannot be
solved by mere reproduction of earlier solutions. At the same time the 
role of direction in reasoning and recall is emphasized, but how this 
factor operates with trial and error, past associations, and mere repro- 
duction of earlier solutions to problems is not faced. We are here smack 
up against the problem of organization which many workers in biology 
as well as psychology realize cannot be dealt with adequately except as 
a problem in its own right. From the reviewer’s experience such prob- 
lems cannot be evaded even in the beginning course because many stu- 
dents have already faced them in courses in philosophy, logic, biology, 
and elsewhere. 

The section on motivation of behavior seems to this reviewer to be 
best in the opening chapter dealing with physiological drives such as 
hunger, thirst, and sex where the material is largely drawn from experi- 
mental sources. The chapter on common social motives reads too much 
like a re-wording of the instinct psychology with too little use made of 
laboratory findings relevant to the topic. The chapter on conflict opens 
with sources of conflict in the environment and in the individual and 
then presents topological representation of conflict situations as an 
“interesting and illuminating method of representing and analyzing
conflict situations” (p. 245). However, in the succeeding treatment of 
reactions to conflict such as compensation, identification, phantasy, 
projection, repression, and experimentally produced conflicts there is 
no further use made of Lewinian concepts. Again the integration must 
either be made by the instructor or the student will suffer from intellec- 
tual indigestion. Other alternatives are to omit topological representa- 
tion or to put it at the very end of the chapter, pointing out that some, 
much, or most of the material discussed (depending upon the degree to 
which the instructor knows topological psychology) can be diagrammed 
in these terms. The author’s penchant for trial and error pops out again 
in his recommendation of it as a possible solution in the alleviation or 
cure of conflict, contrary to the usual emphasis on rational procedures in 
psychotherapy. Since the patient admittedly knows why he is trying 
various lines of action, namely to find a way out of his conflict, it is 
doubtful if the procedure recommended is truly trial and error as Munn 
says. 

The section on feeling and emotion which follows the one on motiva- 
tion of behavior might well have preceded it, as affective states have 
been regarded by almost everyone as motivators of behavior. The main 
findings in the field are well covered with one or two exceptions. In the 
discussion of the Cannon-Bard theory the inhibitory function of the 
cortex is not mentioned and in the diagram illustrating the contrasting 
features of this theory as against the James-Lange theory the cortico-thalamic
inhibitory path is not even shown. In view of the great importance
of the role played by the cortex in inhibiting emotion through
positive inhibitory regulation and in allowing emotional expression
through release of inhibition, the account offered here is entirely inadequate
if not misleading, as reference to Bard’s exposition in the Handbook
of General Experimental Psychology, pp. 305-307, will show. Both
text and diagram need this aspect of the theory for a correct as well as 
complete statement. 

The following section, Knowing Our World, deals with attention and 
the special senses. Munn has here done an excellent job of boiling down 
the classical, and for the most part, stereotyped material and he has 
made it attractive by the use of well-chosen diagrams and half-tones. 
In view of the tremendous use made during the war years and now of 
material from the fields of sensation, perception, and psychophysics, not 
to mention their interrelations with sensori-motor learning, the time is 
past when we can rest content with traditional accounts of these fields. 
There is a wealth of material not yet in any text which modifies the
whole approach to sensory processes and bears on every other field of 
psychology which should form part of the elementary student’s equip- 
ment. Why recent work in some fields finds its way almost at once into 
textbooks and equally important work in other fields must wait a gen- 
eration or more is hard to understand. For example, the explanation 
given of constancy is naive in the extreme in the light of not-so-recent 
work. And the epoch-making contributions by Katz find no reflection 
in the treatment of vision even though they had been in print for 35 
years at the time this book appeared. 

Several inaccuracies in terminology and fact should be corrected in 
future editions, such as: brightness and lightness should not be used in- 
terchangeably, and unless film and aperture modes are distinguished it 
is impossible to appreciate their difference; the assumption that Hering 
“neutral gray” is a constant or a general phenomenon rather than a
special case is not tenable in view of work by Koffka and others; the 
discussion of retinal mixture versus overlapping of lights is so unclear 
it is impossible to determine what is meant and if it is correct; the usual 
explanation of Flor contrast as due to softening or obliteration of contours
is palpably wrong and needs to be supplanted by the correct explanation
given by von Bezold in the early part of the present century;
the interchangeable use of note and tone, so common in discussions of 
hearing, should be replaced by more precise terminology in which tone 
relates to hearing-experience and note to the printed symbol. Only one 
figure of sensory qualities, the double cone for vision, is given although 
the smell prism and the taste tetrahedron are just as good in their respective
modalities. The value of 1/3 as the Weber fraction for temperature
is altogether too large to be representative following Culler’s work.

In general the chapters just considered rely too much, on the
theoretical side, on past experience and similar explanations and suffer
from a lack of unifying principles by which intra- and inter-sensory
material may be related as well as unified with other psychological 
processes. If, as admitted, principles of organization are effective in 
attending, are they not perhaps also of importance in perception, and 
taking a further step, in learning and thinking as well? Recognition of 
such principles might unify and simplify psychology for the beginner. 

The seventh and final section of the book, Individual Differences, 
contains chapters on statistics, intelligence, aptitudes, and personality. 
The chapter on statistics, kept until this section as an “Introduction to
Statistical Analysis of Individual Differences,” might well come earlier,
especially for use in courses with laboratory. However the chapter can 
be introduced as it stands in almost any part of the course so its actual 
position matters little. The other three chapters form a fitting close to 
the book, entirely in the spirit of the more experimental portions in 
being packed with concrete material. Intelligence is approached from 
the historical angle and the important question of heredity and environment
is quite fully discussed. The discussion of factor analysis,
including the fundamental factors found by Thurstone, and the illustratory
material from test batteries make this chapter unique for an
elementary presentation and one of the finest things in the book. 

Similarly the chapters on aptitudes and personality are extremely
well presented and again demonstrate the author’s ability to condense 
a large amount of fact into a relatively small compass. In the chapter on 
personality the discussion includes methods of approach such as case 
history, rating, paper and pencil tests, behavior tests, interviews, free 
association, dream analysis, and projective methods, and also physique 
and temperament, role of the endocrines, and abnormal states. The 
open, empirical treatment here is more acceptable because the subject 
is more familiar to the average student and personality as a concept 
already provides some structuration by which its data can be ordered. 

Taking the book as a whole, what are its pros and cons? On the plus
side it is an excellent text in so far as it provides a wealth of concrete 
factual data chosen from widely different sources both within and out- 
side psychology proper. With some exceptions it represents present-day 
scientific psychology very fairly. The student should come away from
this text with respect for the scientific approach not because he has been
told that it works best in various fields but because he has found that 
material obtained by scientific methods can be applied to many different 
life situations and leads to further fruitful discovery. If the author is 
unable to accept a theory in toto his criticism is so mild and fair that the 
student’s respect for the theory as well as for psychology in general is in 
no wise diminished. This and the catholicity of Munn’s approach should 
exert a very good effect on coming generations of psychologists. Too 
often personal or institutional loyalties lead even graduate students to 
belittle men and work done outside their own bailiwicks with effects 
detrimental both to themselves and to psychology. This book should
serve as an excellent corrective to this tendency. 

On the negative side of an otherwise fine piece of writing and presentation
must be noted the lack of an integrating and unifying point of
view which has been pointed out in our previous discussion. This lack 
results in a looser and more disjointed treatment than is necessary in the 
light of present advances along various fronts. This is not meant to
imply that Munn himself does not have a point of view. As we have seen,
careful reading reveals that for him trial and error is the great principle 
operating in human behavior and a number of indications are present 
that he believes in what has been dubbed an “atomistic logic,” i.e.,
proceeding from “simples” to “complexes.” But having brought into 
his discussion of conditioning the principle of relevance, into thinking 
the principle of direction, into some sensory experiences primitive organization,
and having recognized other whole-properties as well, he
is under obligation to apply them more generally where they are appli- 
cable or at least to square them with the fundamental principles he 
believes are operative. Perhaps he has done this and this reviewer has 
missed it. If so, then it is probable that most students will fail to see 
how it all goes together. 

The tendency to over-simplify has already been pointed out with 
reference to certain neural diagrams but it occurs much more frequently 
in the textual discussion where it leads the author, in making points 
which are quite valid in themselves, to say things he cannot possibly 
mean as they stand. For example, in writing about the influence of 
animal experiments on the theoretical basis of psychology, there ap- 
pears the remarkable statement: “After all, learning is learning and vi- 
sion is vision whether it occurs in man or animal” (p. 10). But the vision 
of the most widely used laboratory animal, the white rat, is very differ- 
ent from that of man, from retina to higher cortical centers, and Munn 
later points out that “Insight is rare in animals, not quite so rare in 
children, and quite common in human adults” (p. 109), meaning to 
distinguish among kinds of learning. One finds too many statements 
like this which take a good deal of explaining to mitigate. 

There can be little doubt the present book will set a pattern for 
future introductory texts. The double columns while providing a shorter 
reading line and more words per page also make possible wider spaces 
for illustrative material and marginal notes. The wealth of charts, 
diagrams, and pictures lessens the instructor’s blackboard work and 
should prove a boon to places where laboratory work cannot be given. 
In this text psychology appears as a positive, if not positivistic, science. 
If it were possible to combine what Munn has done with more emphasis 
on methods and unifying principles we should be much nearer the per- 
fect presentation of present-day psychology everyone desires. 

Harry Helson.

Bryn Mawr College.

Bridges, J. W. Psychology normal and abnormal. Toronto: Sir Isaac
Pitman & Sons, 1946. Pp. xviii+470.

Except for dropping the chapter on philosophical foundations, the 
splitting and partial revision of the chapter on reflexes and instincts, and 
the annotation of the extensive bibliography, the 1946 edition has 
“identical twin” resemblance to its 1930 predecessor. The appearance
of the revision does, however, call attention again to a book with a
classical timelessness of integration (despite an eclectic tolerance), 
stimulating hypotheses, and a style reminiscent of William James. The 
general reader and even the professional psychologist will find interest 
and value here, although the book was designed for the introductory 
psychology course of pre-medical and medical students. 

Bridges chooses to give the distilled essence of a topic rather than to 
lead the reader to a conclusion from the raw data of experiments and 
case studies. The few graphs, tables, and other references to specific
studies are illustrative only. Just as the dramatist’s words furnished the 
bare Elizabethan stage, the emphasis on the logic of the argument in 
this book seems to stimulate more associations and imagery than one 
gets from many texts replete with illustrations. Many more of the quo- 
tations are from English and French psychologists than one meets in 
most American texts. 

The chapter headings might have come from any of a number of 
general psychologies, but the plan of devoting the first half of each 
chapter to normal behavior and the remainder to the related abnormali- 
ties makes a distinctive pattern throughout the book. Technical words 
are italicized and well defined. The chapter on applied psychology 
delimits that field with such precision and perspective that it ought to 
be widely reprinted. 

An error not corrected from the 1930 edition is the taking of the 
standard deviation from the median. Emphasis on the older studies, 
e.g., Downey will-temperament tests, is heavier than on those of the 
last two decades. 
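
Why taking the standard deviation from the median is an error can be shown with a short sketch. The scores below are invented and deliberately skewed; the only thing assumed is the usual definition of the standard deviation as the root-mean-square deviation about the arithmetic mean. Computed about any other point, the median included, the figure can only come out larger.

```python
import math
import statistics

# Invented, deliberately skewed scores so that mean and median differ.
scores = [2, 3, 4, 5, 16]


def root_mean_square_deviation(values, center):
    """Root-mean-square deviation of the values about an arbitrary center."""
    return math.sqrt(sum((v - center) ** 2 for v in values) / len(values))


mean = statistics.mean(scores)
median = statistics.median(scores)

# The standard deviation proper: deviations taken from the mean.
sd_about_mean = root_mean_square_deviation(scores, mean)

# The same computation with deviations taken from the median instead.
rms_about_median = root_mean_square_deviation(scores, median)

print(mean, median, sd_about_mean, rms_about_median)
```

The squared value about the median exceeds the variance by exactly the squared distance between mean and median, so the two figures agree only when the distribution is symmetrical enough for mean and median to coincide.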

The problem of how to teach psychology in medical schools or to 
pre-medical students seems to have led in at least three main directions. 
Some have emphasized a sociological-psychological approach, as in 
Pressey’s Life, since this is a common medical blind spot. Others have 
stressed the genetic-psychosomatic attitude, as found in such authors as 
Maslow and Mittelmann. A third group believe that medical students
are more highly motivated and gain more insight from the contrasts and 
comparisons of the normal and abnormal. Bridges from his experience 
as the first professor of abnormal psychology on a medical faculty has 
provided an effective text for the last group. 

George M. Haslerud.

University of New Hampshire.

Gray, J. S. Psychology in human affairs. New York: McGraw-Hill,
1946. Pp. viii + 646.

While this book is, in many respects, a successor to Gray’s previously
edited text, Psychology in Use (American Book Co., 1941), in
that it discusses the applications of psychology to the main fields of 
practical life, and represents the co-authorship of eleven other con- 
tributors, nevertheless it is not merely a revision. With two excep- 
tions, the co-authors are new. They are less well known than those of 
the former book, but Gray has himself taken a more active part in the 
actual writing of the text. Several chapters appear for the first time, 
such as “Psychology in Speech Correction,” “Psychology in Music,
Art, and Literature,” and “Psychology in Military Affairs.” Others 
appear under new titles, and are written from a new viewpoint. 

Perhaps the outstanding characteristic of the book is its emphasis on 
factual material. For example, Chapter II, on “Psychology in College
Life” contains twenty tables and four graphs. Chapter III, on “Child 
Development” contains ten graphs and twenty-one tables. Much of 
this material is new to textbooks, and with few exceptions, the references
are to studies published after 1930. The general effect of this
emphasis on experimental data and practical findings is to require a 
change in teaching methods on the instructor’s part. His function is no 
longer to supplement the text with up-to-date illustrative material, but 
rather to interpret and evaluate that which is given. Less supplemen- 
tary assigned reading is needed, and much more digesting of the text by 
the student. The art of reading and interpreting tables and graphs is 
one which requires special training. Many students are allergic to statistics,
though this is not in itself an argument for using them sparingly,
if the instructor is competent to vitalize them. But the chief value of 
facts is to illustrate and support laws, principles and theoretical formulations.
They are most effectively used in the inductive development of
a topic. Most of these facts are and should be promptly forgotten by the 
student, so that his memory is freed for the permanent retention of the 
principles. The immature student needs much expert guidance in rec- 
ognizing the bare essentials of fact to be learned. While this book is 
many strides ahead of the type which presents only unsupported asser- 
tions, or illustrations selected for their patness only, does it err slightly 
in the opposite direction? 

Another important characteristic of the book is its emphasis on the 
practical. This is to be expected in an applied psychology text, but is 
seldom achieved. Omnibus books too often give an impression of sketchy 
remoteness, with little practical contact, while technical treatises 
are written for the advanced student who wants specialized information.
This book, by achieving a compromise between these two extremes and 
by emphasizing the practical aspects of each field for the layman, fills a 
real need. Its range of topics is wider, its treatment more complete than
is customary in such books. 

In spite of its up-to-dateness, the book occasionally presents old 
data which have been superseded, or theory which is now modified. In 
one or two cases, quite erroneous statements appear, such as the follow- 
ing, in connection with a discussion of the topic of I.Q. constancy, on 
pages 91-92:

If the child develops mentally at exactly the same rate as other children 
tested, his I.Q. will remain constant. However, if his mental development is 
faster than that of other children, his I.Q. will increase. Likewise, if his mental 
development is slower than that of other children, his I.Q. will decrease. 

The author seems to have confused constancy of I.Q. with normality of
I.Q., for if the statement were taken as it stands, it would mean that no 
child’s I.Q. is constant if he has a faster or slower developmental rate 
than the average child. 

An innovation in this book which probably has pedagogical value 
and would be used oftener by authors of texts if publishers would let 
them, is the table of contents at the beginning of each chapter. The 
addition of page numbers would increase its value. A word must be 
said in criticism of certain reproduced charts, in which the reduction in 
size of print needed to get them on the page has made them unreadable; 
for example, those on pp. 474 and 573. Otherwise the style of the book 
is good. 

Many psychology teachers will welcome this book either as a
supplement to the general course in psychology, or as a second course to follow
the introductory one. Those who teach adult extension classes will find 
it an excellent survey text, both meaty and comprehensive. 

Arthur G. Bills. 

University of Cincinnati.

Luck, J. M., & Hall, V. E. (Eds.). Annual review of physiology (Vols.
VII & VIII). Stanford Univ. P.O.: Annual Reviews, Inc. and
American Physiological Society, 1945, 1946. Pp. vi + 774 (Vol.
VII). Pp. vi + 658 (Vol. VIII).

These are Volumes VII and VIII of the annual series begun in 1937
and published jointly by the American Physiological Society and An- 
nual Reviews. It is the declared editorial policy of the Review that “en- 
couragement is given only to preparation of reviews which survey the 
important contributions of the preceding year or biennium, which ap- 
praise them critically and evaluate with discrimination the present 
status of the subject. Comprehensive reviews in which the task of the 
author is one of compilation rather than of appraisal are deliberately 
eschewed.” Despite this policy, some of the reviews are principally
compilations or annotated bibliographies. And some are rather spotty 
compilations mixed with evaluation, while only a few reach the goal of
really critical reviews of the literature. On the whole, however, the 
reader can obtain a picture of the more significant aspects of research 
progress in the respective fields covered by the review. 

Volume VII contains 26 chapters written by 30 authors, and Volume 
VIII contains 25 chapters by a total of 29 authors. In each case, 
bibliographies of literature cited in the various reviews total about 4,000 
references. At the end of each volume is an author index of about 4,000 
items and a subject index of about 40 pages in length. 

Because the reviews are written by physiologists for physiologists, 
more than half of each volume is of no interest to the psychologist except 
that occasionally there is a brief treatment in a sentence or two of some 
psycho-physiological problem. The psychologist, however, who wants 
to find out what has happened recently in some special aspect of
physiology relevant to his field of teaching or research will find that 
his best bet is to go to these volumes and to consult the excellent author 
and subject indices before resorting to other less up-to-date textbooks 
or to more laborious methods of library research. 

More than that, however, many of the special chapters in physiology 
are good reading for psychologists engaged in the respectively related 
field of psychology: for genetic psychology, Physiological aspects of
genetics (VII and VIII) and Developmental physiology (VII and VIII);
for sensory psychology, The special senses (VII) and Audition (VIII);
for neural mechanisms of behavior, Electrical activity of the brain (VII),
Conduction and synaptic transmission in the nervous system (VII), Nerve 
and synaptic conduction (VIII), The visceral functions of the nervous 
system (VII and VIII); and for a general review of physiological psy- 
chology, Physiological psychology (VII and VIII). 

The contrast between the chapters on physiological psychology in 
the two volumes deserves a special comment. In Volume VII, Stone 
has presented a careful and critical review. He has covered thoroughly 
the recent literature and has appraised its strength and shortcomings so 
that the reader can see what has happened and what it means for
physiological psychology.

The same chapter in Volume VIII by Seashore, however, does 
neither of these two things. It opens with a philosophical discussion of 
the mind-body problem and of the scientific approach to it. The chapter 
then proceeds to summarize the present status of individual differences 
in skills, abilities, aptitudes and capacities. Finally, it gives a general 
summary of the effects of extreme working conditions upon the effec- 
tiveness of human performance. Thus Seashore’s chapter spends a lot 
of time on problems which are not physiological psychology, in any 
reasonable definition of the field; and he fails to review or appraise the 
recent literature in the field. As a consequence, the physiologist reading 
the two chapters is likely to be bewildered by two so very different con- 
cepts of physiological psychology. 

Looked at in perspective, these two volumes of Annual Review of
Physiology, like previous volumes in the series, are an extremely
valuable aid, not only to physiologists, but to all those for whom physiology
is an important ancillary subject. By and large, the chapters give schol- 
arly up-to-date appraisals of their respective fields. As he has stated 
before, this reviewer feels that a companion volume, giving an annual 
review of psychology, would be an invaluable aid to psychologists, which 
would help us “keep up with the literature” and give us better
perspective on the developments in our field.

Clifford T. Morgan. 

The Johns Hopkins University.

Barker, Roger G., Wright, Beatrice A., and Gonick, Mollie R.
Adjustment to physical handicap and illness: a survey of the social
psychology of physique and disability. New York: Social Science
Research Council, 1946. (Bulletin No. 55.) Pp. xi + 372.

This is another in the excellent series of research summaries spon- 
sored by the Social Science Research Council. The authors have earned 
special commendation by providing intelligently critical comments as 
to the assumptions and thinking of earlier investigators, rather than 
merely reporting data and conclusions; by writing this summary of 
prior research into a theoretical frame of reference (topological psy- 
chology); and by introducing some well-chosen original material to 
illuminate the inferences drawn from published sources. 

From their survey the authors have eliminated somato-psychologi- 
cal studies of age, sex, race, and speech defects, on the ground that these 
have recently been covered adequately by other reviewers. Leprosy is 
discarded as a minor problem in the western world. Of the remaining 
areas, detailed reports are presented on: normal variations in physique;
crippling; tubercular conditions; impaired hearing; and acute illness. 
Bibliographies are added on: visual disability; cardiac conditions; 
diabetes mellitus; cosmetic defect; rheumatism; and cancer. 

The least satisfactory chapter is that on variations in normal phy- 
sique. There is a good section on size changes at adolescence, but the 
discussion of variations in adult size is rather elementary, and no men- 
tion whatever is made of Sheldon’s The varieties of human physique and 
The varieties of temperament. Even if one does not accept Sheldon’s
theory, his work can hardly be ignored. The authors occasionally dip a 
hesitant toe into the cold waters of endocrinology, genetics and autonomic 
nervous function, then withdraw hastily. Clarity would have been im- 
proved by frankly excluding such material. 

Outstanding treatments of crippling, tubercular conditions and 
impaired hearing more than atone for any shortcomings of the earlier 
chapter. Particularly interesting is the mode of analysis in terms of 
overlapping situations. A disabled person is able to function on a par
with normals in some environments; he is decisively barred from other 
situations; but between these extremes will fall a range of ambiguous
conditions in which the handicapped person may participate, but under 
difficulties. The Lewinian concepts of barrier, valence, potency and 
congruence are used fruitfully to show basic similarities between the 
situations facing the orthopedic cripple, the tubercular, the deaf and the
individual with acute illness. 

The person with impaired hearing, for example, functions in many 
situations unnoticed by his normal associates. If the behavior involved 
does not require auditory controls, he may compete on equal terms. 
Where hearing is involved, he may be handicapped and subject to extra 
criticism, since his impairment is not obvious and many normals (e.g.,
school teachers) fail to make allowances for it. The barriers in his field 
are indefinite (as contrasted with the orthopedic cripple, for example, to 
whom certain activities are plainly impossible), and this condition often 
gives rise to vacillation and instability. The valence of full-normal ac- 
tivity is positive and high, but the valence of failure and criticism is 
negative and high. Thus physically handicapped persons are likely to
show the familiar symptoms of conflict. 

We occasionally feel, in these topological analyses, that there is an 
unstated shift from the topology of the external situation (geographical 
environment) to the situation as perceived by the individual (behavioral 
environment). In the case of cripples, for example, some activities are
objectively impossible, whereas others are subjectively considered to be 
impossible. It is clear that these two should not be treated as identical, 
and yet that impression is sometimes given. If the entire analysis were 
erected on a perceptual basis, this uncertainty could have been avoided. 

The importance of the individual’s perception of his defect, and of
his behavioral field, is well illustrated by the discussion of family 
attitudes toward the handicapped. Many parents reject the handi- 
capped child, while others over-protect him and keep him in an infantile 
status. Optimum adjustment seems to be achieved when the parents 
adopt an understanding, objective attitude which focuses the child’s 
attention on realistic assessment of the situation. Excessive sympathy 
and pity are likely to encourage exaggeration of barriers and exclusion 
of many possibilities for normal participation. 

The final chapter on employment of disabled persons gives a realis- 
tic and well-considered treatment of this problem, the solution of which 
is basic to optimum adjustment of handicapped adults. 

Ross Stagner. 

Dartmouth College. 

Lewis, Claudia. Children of the Cumberland. New York: Columbia
Univ. Press, 1946. Pp. xviii + 217.

Before going to the Southern highlands, Miss Lewis was a teacher in 
the Harriet Johnson Nursery School in Greenwich Village, New York
City. She compares the behavior of the children in the nursery school
which she established in the mountains of Tennessee with the behavior
of her Greenwich Village pupils. The majority of the material presented
in the book concerns the mountaineer subjects.

A considerable part of the volume consists of a collection of incidents 
involving child care or child behavior. These range in length from single 
sentences to a page or more, in age of subjects from birth to senility, in
form from dialogue to descriptive essays. They are extremely readable
and serve to render very vivid the life of the mountain people. 

Miss Lewis does not attempt to present quantitative measures, but
for this she cannot be reprimanded. It is apparent that she devoted
very full days to the nursery school, and that research had a secondary
place. Nevertheless, her thinking is quantitative. She emphasizes the
diversity of individual behavior which takes place in both schools, and 
makes clear that there is overlapping between the schools. However, it 
is her belief that there are large differences in central tendencies be- 
tween her two kinds of subjects. With this contention, it is likely that 
nearly every person who is familiar with both types of children will 
agree. 

Miss Lewis finds more spontaneity, more energy, more conflict, more 
aggression in the New York group. The mountain children are more 
placid, more compliant, more quiet, and in some respects, better
adjusted.

She does not find the explanation of these differences in any one
factor. Among the probable causes mentioned are the following: the 
differences in spaciousness of the environment, differences in climate, 
health, and nutrition, differences in sleeping habits, differences in infant
care and family structure, differences in discipline, and differences in
environmental stimulation.

Throughout the book, Miss Lewis shows an excellent understanding
of child development in both New York and Tennessee, not an easy
achievement. She also displays a high degree of ability as a writer. This 
combination of traits makes her book one of the best in the “child in 
a culture” field. It should be profitable alike to teacher, parent,
psychologist, and sociologist.

The reader may sample for himself some of Miss Lewis’s attitudes 
and style in the following quotation from her concluding chapter: 

No, now that this study is made I am not packing my trunks with the in- 
tent of moving down to Tennessee, building me a cabin, taking to the “simple 
life” and rearing my hypothetical children in the way the Summerville families
do. For us it is not a question of attempting to turn the clock back in that way,
which, indeed, would be as impossible as it would be undesirable. It is rather a 
question of trying to bring to Greenwich Village a little more of . . . Summer- 
ville life .... 

Wayne Dennis. 

University of Pittsburgh.

Leeper, Robert. Psychology of personality. Ann Arbor: J. W. Edwards,
1946. Pp. 167. 

The format of this book, resembling that of a laboratory manual or 
workbook, may lead many to overlook this significant treatment of 
personality and mental hygiene. 

The author’s organization apparently proceeds from an assumption 
which has increasingly impressed itself on the reviewer in recent years; 
namely, that, in order to have functional significance, any treatment of 
mental hygiene must be based on a consistent theory of personality 
processes. Moreover, such a treatment should not be left among the 
author’s unverbalized potentialities, but should be given systematic 
formulation. It is to this task that the major emphasis of Leeper’s book 
is devoted. 

Leeper’s treatment is mainly an elaboration of the following thesis: 

In general, the term “personality” covers three things: (1) the person’s
motives, and especially his emotional motives, or ways in which he responds 
emotionally in different life situations; (2) the general techniques by which, 
characteristically, he tries to attain satisfaction for these motives; and (3) the 
background of meanings or pictures of reality which determine the motives and 
types of adjustive responses of the person (p. 5).

The discussion of motivation distinguishes between physiological 
and emotional motives and between positive and negative emotional
motives. While the latter distinction seems arbitrary, since the
“negative” motives can be regarded as the products of frustration of the
“positive” motives, it serves a useful expository purpose when the au- 
thor deals with motivational differences existing between well adjusted 
and poorly adjusted personalities. 

The techniques by which the person tries to attain satisfaction for 
his motives are considered from two points of view: (1) the nature of the 
learning processes, and (2) the description of effectual and ineffectual 
adjustment techniques. The learning processes are treated with due 
attention to the dynamic complexities recognized in modern learning 
theory. In addition to describing the usual techniques employed by 
maladjusted personalities, the author discusses some of the major 
techniques by which superior personalities distinguish themselves. 

In his discussion of the “background of meanings,” the author deals 
competently with an aspect of behavior which seldom receives the em- 
phasis merited by its significance for personality dynamics. Leeper’s 
thesis is that “ . . . a person cannot govern his behavior just by what 
is objectively and actually true, but . . . must forever live and react in 
terms of properties which he infers as existing because of his experience 
in previous situations” (p. 92). This view that behavior is determined 
not as much by the character of objective reality as by the individual’s 
interpretation of reality (through the phenomenon of emotional trans- 
ference) is supported in terms of the principle of equivalence of stimuli 
and the principle of substitute response or displacement. Treated in 
these terms, the “background of meanings” is seen to be an aspect of
personality whose importance has been emphasized by such widely 
separated disciplines as the research of animal psychologists and the 
clinical observations of psychoanalysts. 

It is the reviewer’s opinion that through the medium of Leeper’s 
book the principles of personality functioning are made understandable 
to the average undergraduate without undue simplification or loss in 
organic quality. For this reason, it seems regrettable that the book was 
not produced in a form more likely to have wide distribution.

The book will probably be disappointing to those who feel that a
textbook should serve as a compendium of psychological research
findings. While the author draws freely upon research sources, these tend
to lose their distinctive identities in the author’s discussion. No bibliog- 
raphy or index is provided. 

Bert R. Sappenfield. 

Montana State University. 

Kelley, Douglas M. 22 cells in Nuremberg. New York: Greenberg,
1947. Pp. 245. $3.00.

The author of this book was for five months the official psychiatrist 
at the Nuremberg prison and in that capacity made psychiatric
examinations of all the 22 top-ranking Nazi prisoners. The customary medical
and psychiatric procedures were supplemented by Rorschach personal- 
ity tests and Wechsler-Bellevue intelligence tests given by the author’s 
fellow officer, Capt. G. M. Gilbert. The examinations and tests were
further supplemented by information obtained from former intimate 
associates of the accused and from motion pictures, speeches, writings, 
and other records. 

Except for Rudolf Hess and Hermann Goering, this was the first
psychiatric study to be made of any of the accused. In view of the fact 
that twelve of the group are no longer living and that all but three of the 
others were disposed of by prison sentences of from ten years to life, the 
documents and conclusions of Dr. Kelley are destined to be of lasting 
historical interest. 

The task which the author set himself was not merely or chiefly to 
determine the degree of mental responsibility of the subjects, but rather 
to investigate their basic personality patterns. He wanted to find out 
what these men were like who had made themselves masters of eighty 
million people, and what factors in childhood, youth, and later years 
had made them what they were. The book attempts to answer these 
questions in language sufficiently nontechnical to be intelligible to 
readers who are relatively unfamiliar with esoteric theories of personal- 
ity. We are informed that more detailed reports of the work will be 
published later in professional journals, and that transcripts of the in- 
terviews and other records will ultimately be available to historians. 

Among those to whom most space is devoted are Goering (27 pages), 
Hess (22 pages), Ley (21 pages), Rosenberg (13 pages), and Streicher 
(11 pages). The other 17 members of the group get from three to eight
pages each. By good luck the examination of Ley had been completed 
before he committed suicide, and we are told that the post-mortem 
examination of his brain confirmed the psychiatric diagnosis. 

With the exception of one chapter, the book is concerned entirely 
with the 22 Nazis who were studied first-hand by the author. The 
additional chapter presents a 35-page portrait of Hitler based on infor- 
mation and comments obtained from the Fuhrer’s contemporaries, his 
aides, his personal physicians, and his secretaries. Some of this
information is new, and the author’s interpretations differ in several
important respects from those which have been current.

Within the limits of a brief review it is not possible to summarize the 
author’s interpretations of the individual subjects. In fact, each
portrait as sketched is a unified gestalt that almost defies further
condensation. The sitters for these portraits composed indeed a motley group.
They ranged from the stupid to the highly intelligent; from the
semi-insane to the stable and well integrated; from the shrewd and talented
leader to the errand-boy hanger-on seeking in Hitler a father surrogate. 
But there were three characteristics which they had in common:
inordinate ambition, debased ethical standards, and a hyperdeveloped
nationalism that justified anything done in the name of Germandom — 
plus, of course, an economic and political environment that allowed full 
play to their ruthless wills. 

The author’s conclusion is that Nazism was a “socio-cultural
disease,” epidemic among our enemies but endemic everywhere. He tells
us that the Nazi leaders were not the rare and spectacular types that 
can be expected to appear only once in a century. Instead, neurotics 
like Hitler, with “hysterical disorders and obsessive complaints, can be
found in any psychiatric clinic.” Similar ones, thwarted and
discouraged, but determined to do great deeds, roam the streets of every
American city. “Strong, dominant, aggressive, and egocentric personalities
like Goering . . . can be found anywhere — behind big desks deciding 
big affairs as businessmen, politicians, and racketeers.” We hardly need 
to be reminded that men strongly resembling some of these types
occasionally win election to our highest law-making bodies or to the
governorship of a great state.

Dr. Kelley has analyzed for us 22 types of totalitarian-virus, has 
described the soil in which they thrive, and has indicated some of the
means by which society can protect itself against them. His book will
inevitably be compared with one written by another psychiatrist — 
Brickner’s Is Germany Incurable? Of the two, the reviewer finds Kel- 
ley’s less controversial and no less challenging.* 

Lewis M. Terman. 

Stanford University.

* Since this review was written, Nuremberg Diary, by Capt. G. M. Gilbert, has been 
published. This book should be read along with Dr. Kelley's. L.M.T. 